microsoft / clrmd

Microsoft.Diagnostics.Runtime is a set of APIs for introspecting processes and dumps.
MIT License
1.05k stars 254 forks source link

_isVirtual has to be set as true in the context of DataTarget.LoadDump. #1279

Open stjeong opened 1 month ago

stjeong commented 1 month ago

Description

At .\Microsoft.Diagnostics.Runtime\DataReaders\Core\CoreDumpReader.cs,

CreateModuleInfo always create PEModuleInfo instance with "virtualHint == false",

private ModuleInfo CreateModuleInfo(ElfLoadedImage image)
{
    using ElfFile? file = image.Open();

    string path = image.FileName.Replace(" (deleted)", "");
    if (file is not null)
    {
        long size = image.Size > long.MaxValue ? long.MaxValue : unchecked((long)image.Size);
        return new ElfModuleInfo(this, file, image.BaseAddress, size, path);
    }

    return new PEModuleInfo(this, image.BaseAddress, path, false); // <--- HERE!
}

When dll image in memory dump file has 'export data directory', this code lead to almost infinite loop because of invalid ImageExportDirectory instance given by TryGetExportSymbol method.

// .\Microsoft.Diagnostics.Runtime\Utilities\PEImage\PEImage.cs
public bool TryGetExportSymbol(string symbolName, out ulong offset)
{
    int nameIndex = 0;
    try
    {
        ImageDataDirectory exportTableDirectory = ExportDirectory;
        if (exportTableDirectory.VirtualAddress != 0 && exportTableDirectory.Size != 0)
        {
            if (TryRead(RvaToOffset(exportTableDirectory.VirtualAddress), out ImageExportDirectory exportDirectory)) // <--- HERE!

     // ...omitted for brevity...

At the code of "RvaToOffset" method,

public int RvaToOffset(int virtualAddress)
{
    if (virtualAddress < 4096)
        return virtualAddress;

    if (_isVirtual) // <--- HERE!
        return virtualAddress;

    ImageSectionHeader[] sections = ReadSections(); // <---- Should not run this code in the context of memory dump
    for (int i = 0; i < sections.Length; i++)
    {
        ref ImageSectionHeader section = ref sections[i];
        if (section.VirtualAddress <= virtualAddress && virtualAddress < section.VirtualAddress + section.VirtualSize)
            return (int)(section.PointerToRawData + ((uint)virtualAddress - section.VirtualAddress));
    }

    return -1;
}

Because _isVirtual is false, it doesn't run second if's "return virtualAddress", instead it calculates the value as RVA for PE offset, like loading from FILE directly.

Of course, the return value is incorrect address, the caller(TryGetExportSymbol) fills exportDirectory with garbage.

Other information

This problem is not a problem when loading memory dump for .NET 6 or later, because most .NET 6 DLL (for example: System.Private.CoreLib.dll) has no export data directory.

However, if the memory dump contains user DLL with export data directory, DataTarget.LoadDump and DataTarget.ClrVersions runs a lot of loop (because the loop count in TryRead method is filled from garbage). And for .NET 5 memory dump in linux, you can meet the problem, because System.Private.CoreLib.dll has export data directory, whose size is 0x43.

### Tasks
stjeong commented 1 month ago

This dump file is based on a basic console app.

If you open the dump file using this code,

using (DataTarget target = DataTarget.LoadDump(dmpFilePath))
{
    ImmutableArray<ClrInfo> clrVersions = target.ClrVersions; // have to wait for so long time, because of the loop
}

You can't miss the problem.

net5_console.zip

stjeong commented 1 month ago

For dump file in windows, Reader class is defined in ".\Microsoft.Diagnostics.Runtime\DataReaders\Minidump\MinidumpReader.cs" set "isVirtual" argument as true.

        public IEnumerable<ModuleInfo> EnumerateModules()
        {
            return from module in _minidump.EnumerateModuleInfo()
                   select new PEModuleInfo(this, module.BaseOfImage, module.ModuleName ?? "", 
                      true,  // !!!! HERE
                      module.DateTimeStamp, module.SizeOfImage);
        }