wisk / medusa

An open source interactive disassembler

Question: FormatDisassembly #21

Open saeschdivara opened 9 years ago

saeschdivara commented 9 years ago

Hey,

I was trying to print out every single line of the disassembly with formatting. It got really slow in the following method:

// OPTIMIZEME: This function could be very time consumming (use deque?)
bool MappedMemoryArea::_GetPreviousCellOffset(TOffset Offset, TOffset& rPreviousOffset) const
{
  while (Offset != 0x0)
  {
    --Offset;
    if (m_Cells[Offset] != nullptr)
    {
      rPreviousOffset = Offset;
      return true;
    }
  }

  return false;
}

Now my question is: can I just switch the type of m_Cells to a deque, or do I need to change something else?
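A note on the question above: assuming m_Cells is an indexable container with one slot per offset, swapping it for a deque would leave the loop unchanged, so it would still walk every empty slot one by one. One possible alternative, shown here only as a minimal sketch and not as Medusa's implementation, is to keep the occupied offsets in an ordered set and look up the predecessor directly. The class `OccupiedOffsets` and the `TOffset` alias below are hypothetical stand-ins.

```cpp
#include <cassert>   // only needed by callers that want to assert on results
#include <cstdint>
#include <set>

using TOffset = std::uint64_t;  // assumption: stand-in for Medusa's TOffset

// Hypothetical helper: remembers which offsets hold a cell, in sorted order.
class OccupiedOffsets
{
public:
  void Insert(TOffset Offset) { m_Offsets.insert(Offset); }

  // Finds the closest occupied offset strictly below Offset.
  // Costs O(log n) instead of scanning every empty slot.
  bool GetPreviousOffset(TOffset Offset, TOffset& rPreviousOffset) const
  {
    auto It = m_Offsets.lower_bound(Offset);  // first element >= Offset
    if (It == m_Offsets.begin())
      return false;                           // nothing lies below Offset
    --It;                                     // step back to the predecessor
    rPreviousOffset = *It;
    return true;
  }

private:
  std::set<TOffset> m_Offsets;
};
```

The trade-off is extra bookkeeping: every place that adds or removes a cell would also have to update the set.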

wisk commented 9 years ago

Hey Sascha,

I need more details in order to help you. For instance: why do you need to get the previous address? If you're looking for an example that disassembles everything, you might want to take a look at https://github.com/wisk/medusa/blob/dev/src/test/test_arch.cpp#L109. Let me know if it is what you are looking for. :)

saeschdivara commented 9 years ago

Well, maybe I have a totally wrong idea of how to print out the whole code formatted:

    medusa::Address FirstAddr = _medusa.GetDocument().GetFirstAddress();
    medusa::Address LastAddr  = _medusa.GetDocument().GetLastAddress();

    medusa::PrintData Print;
    medusa::FormatDisassembly FmtDisasm(_medusa, Print);

    medusa::u32 m_FormatFlags = medusa::FormatDisassembly::ShowAddress |
                        medusa::FormatDisassembly::AddSpaceBeforeXref |
                        medusa::FormatDisassembly::Indent;

    QFile assemblyFile("test.asm");
    if (!assemblyFile.open(QIODevice::ReadWrite | QIODevice::Text))
        return;

    qDebug() << "Created file at: " << assemblyFile.fileName();

    QTextStream out(&assemblyFile);

    int iLineNumber = 0;

    while (FirstAddr != LastAddr) {

        while (!_medusa.GetDocument().GetNextAddress(FirstAddr, FirstAddr))
            qDebug() << "Nothing found at " << FirstAddr.GetOffset();

        FmtDisasm(FirstAddr, m_FormatFlags, 1);

        std::string Line = Print.GetTexts();
        out << QString::fromStdString(Line);

        iLineNumber++;

        if (iLineNumber == 100) {
            iLineNumber = 0;
            assemblyFile.flush();
        }
    }

    assemblyFile.close();

I just created a menu item to be able to activate it after all the work is done. I want a normal assembly file to be able to analyse it further with maybe something else...

wisk commented 9 years ago

Your code looks good to me. Is it slower than medusa_text? Dumping disassembled text takes quite a long time, especially here, since Medusa has to re-disassemble each instruction every time it's needed. I like your idea of saving disassembled code to a file; unfortunately I cannot add this feature right now, since it would require implementing a blocking asynchronous task.

saeschdivara commented 9 years ago

Well, it hasn't finished after 12 hours for the same file I gave you a link to... So it is very slow. The file is 316.4 MB. I will try it now for myself.

wisk commented 9 years ago

Wow, 12 hours, something bad is happening. :p Could you check the addresses in the log file? I suspect Document::GetNextAddress fails and always returns the same address. That would explain the endless loop.
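To illustrate the suspicion above, here is a minimal, self-contained sketch of a walk that stops as soon as the stepping function fails or stops moving forward, instead of retrying forever. It does not use Medusa's API: `NextFn` is a hypothetical stand-in for Document::GetNextAddress, and plain integers stand in for medusa::Address.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using TOffset = std::uint64_t;  // assumption: plain offsets instead of medusa::Address

// Hypothetical stand-in for Document::GetNextAddress: returns false on failure.
using NextFn = std::function<bool(TOffset, TOffset&)>;

// Visits addresses from First up to Last (inclusive).
// Returns false if the stepper fails or does not move forward,
// so a stuck stepper ends the walk instead of looping endlessly.
bool WalkAddresses(TOffset First, TOffset Last, NextFn const& rNext,
                   std::vector<TOffset>& rVisited)
{
  assert(First <= Last);
  TOffset Cur = First;
  while (Cur <= Last)
  {
    rVisited.push_back(Cur);
    if (Cur == Last)
      return true;

    TOffset Next = Cur;
    if (!rNext(Cur, Next))
      return false;  // stepper reported a failure: give up, do not retry
    if (Next <= Cur)
      return false;  // same (or earlier) address returned: no progress
    Cur = Next;
  }
  return true;       // stepped past Last without landing exactly on it
}
```

Because the current address must strictly increase on every iteration, the loop always terminates, and a `false` return tells the caller at which address (the last entry in `rVisited`) the stepper misbehaved.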

saeschdivara commented 9 years ago

Well, I guess that must be true... I have updated my code so it may perform better but I am not sure if it really works:

class MemoryDisAssemblerTask : public QRunnable
{
public:

    MemoryDisAssemblerTask(medusa::Medusa * medusa, medusa::Address const rAddress)
        : m_pMedusa(medusa), m_Address(rAddress) {
    }

    void run()
    {
        medusa::MemoryArea const* rMemArea = m_pMedusa->GetDocument().GetMemoryArea(m_Address);

        qDebug() << "Hello world from thread" << QThread::currentThread();
        qDebug() << m_Address.GetOffset();

        QString assemblyName = QString::fromStdString( rMemArea->GetName() ) + QString(".asm");

        if (assemblyName.startsWith('.')) {
            assemblyName = QString("sec_") + assemblyName;
        }

        QFile assemblyFile(assemblyName);
        if (!assemblyFile.open(QIODevice::ReadWrite | QIODevice::Text))
            return;

        QTextStream out(&assemblyFile);

        qDebug() << "Created file at: " << assemblyFile.fileName();

        medusa::Address FirstAddr = m_Address;

        int iLineNumber = 0;

        std::string MemoryName = rMemArea->GetName();

        medusa::PrintData Print;
        medusa::FormatDisassembly FmtDisasm(*m_pMedusa, Print);

        medusa::u32 m_FormatFlags = medusa::FormatDisassembly::ShowAddress |
                            medusa::FormatDisassembly::AddSpaceBeforeXref |
                            medusa::FormatDisassembly::Indent;

        QString qMemoryName = QString::fromStdString(MemoryName);

        medusa::TOffset EndOffset = FirstAddr.GetOffset() + rMemArea->GetSize();
        qDebug() << qMemoryName << " EndOffset: " << EndOffset;

        while (true) {
            qDebug() << qMemoryName << FirstAddr.GetOffset();

            FmtDisasm(FirstAddr, m_FormatFlags, 1);

            std::string Line = Print.GetTexts();
            out << QString::fromStdString(Line);

            iLineNumber++;

            if (iLineNumber == 1000) {
                iLineNumber = 0;
                assemblyFile.flush();
            }

            if (FirstAddr.GetOffset() == EndOffset) {
                qDebug() << "We are at the end of " << qMemoryName;
                break;
            }

            if (!m_pMedusa->GetDocument().GetNextAddress(FirstAddr, FirstAddr)) {
                qDebug() << "Nothing found for " << qMemoryName;
                break;
            }
        }

        assemblyFile.flush();
        assemblyFile.close();
    }

private:
    medusa::Address m_Address;
    medusa::Medusa * m_pMedusa;
};

void MainWindow::on_actionSimpleAction_triggered()
{
    _medusa.GetDocument().ForEachMemoryArea([&](medusa::MemoryArea const& rMemArea)
    {
        MemoryDisAssemblerTask *hello = new MemoryDisAssemblerTask(
                    &_medusa,
                    rMemArea.GetBaseAddress()
                    );

        QThreadPool::globalInstance()->start(hello);
    });
}

wisk commented 9 years ago

It looks great, feel free to include this feature on medusa. :)

saeschdivara commented 9 years ago

Well, it is not finished yet since there are still some problems, but I think I can fix most of them and then make a pull request.

saeschdivara commented 9 years ago

Is it correct that I can only check where the end of the section is with this code:

    if (!rMemArea->GetNextAddress(FirstAddr, FirstAddr)) {
        qDebug() << "Nothing found for " << qMemoryName;
        break;
    }

wisk commented 9 years ago

Actually, no, since the Document::GetNextAddress method should handle the case where the next address is located in the next MemoryArea (i.e. the next section, in the PE case). To tell the truth, I'm not very confident in the reliability of this method (the actual implementation is located in both bool TextDatabase::_MoveAddressForward(Address const& rAddress, Address& rMovedAddress, s64 Offset) const and bool TextDatabase::_MoveAddressBackward(Address const& rAddress, Address& rMovedAddress, s64 Offset) const). That may explain why it doesn't work when you rely on it alone. If you find a special case (executable + address) where it doesn't work as expected, feel free to share it and I'll provide a fix ASAP.
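The behaviour described above (the next address may live in the following memory area) can be modelled with a small, self-contained sketch. This is not the TextDatabase implementation: `MemoryAreaModel` and the free function `GetNextAddress` below are hypothetical, and plain integers stand in for medusa::Address.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

using TOffset = std::uint64_t;  // assumption: plain offsets instead of medusa::Address

// Hypothetical model of a memory area: a range plus the absolute
// addresses at which a cell (instruction or data) starts.
struct MemoryAreaModel
{
  TOffset Base;
  TOffset Size;
  std::set<TOffset> Cells;
};

// Finds the first cell address strictly greater than Addr.
// rAreas must be sorted by Base and must not overlap.
// When the current area has no further cell, the search continues
// in the following areas, skipping any that are empty.
bool GetNextAddress(std::vector<MemoryAreaModel> const& rAreas,
                    TOffset Addr, TOffset& rNext)
{
  for (auto const& rArea : rAreas)
  {
    if (rArea.Base + rArea.Size <= Addr)
      continue;  // this area lies entirely before Addr

    auto It = rArea.Cells.upper_bound(Addr);  // first cell > Addr
    if (It != rArea.Cells.end())
    {
      rNext = *It;
      return true;
    }
  }
  return false;  // Addr was the last cell of the last area
}
```

In this model, a caller that wants to stay inside one section would compare the returned address against `Base + Size` of the current area, rather than wait for the stepper to return false, since the stepper only fails after the very last area.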

saeschdivara commented 9 years ago

Well, how could I check manually if it stops correctly?

saeschdivara commented 9 years ago

I think it works correctly, but my exe file is about 25 MB in size, and writing down that many instructions is not as fast as it seems...

wisk commented 9 years ago

Probably not the best way, but that's what I would do: let it run for several minutes (e.g. 30 minutes), then attach a debugger to see what actually happens. This method sucks but it's simple. BTW: your executable looks really heavy :) I think it embeds a lot of data, which slows medusa down.

saeschdivara commented 9 years ago

I think so too, because while it is writing I see almost nothing but db instructions ;)