0xd4d / dnlib

Reads and writes .NET assemblies and modules
MIT License

PDB Generated is wrong #530

Closed jespanag closed 1 year ago

jespanag commented 1 year ago

Hello!

I'm doing a simple test, but it's failing and I can't find the solution.

The problem: when I load a module (ModuleDefMD.Load) together with its original, correct .pdb and then save it (with no modifications to the assembly), the newly generated .pdb is malformed for this application.

I think it's malformed because the .pdb can't be loaded with Mono.Cecil or with a disassembler like dotPeek.

The assembly used to test the .pdb is a .NET 7 MAUI app (app and .pdb attached): MauiTest.zip

To reproduce:

var asm = ModuleDefMD.Load(".../MauiTest.dll");
//asm.LoadPdb();
var writer = new ModuleWriterOptions(asm);
writer.WritePdb = true;
//writer.PdbFileName = ".../dnlib_test/MauiTest.pdb";
asm.Write(".../dnlib_test/MauiTest.dll", writer);

// OK, the assembly is written. Now load the .pdb with another library such as Mono.Cecil:

var readerParameters = new ReaderParameters
{
    //AssemblyResolver = new XamlCAssemblyResolver(),
    ReadWrite = true,
    ReadSymbols = true,
};

AssemblyDefinition.ReadAssembly(".../dnlib_test/MauiTest.dll", readerParameters); // Fails

Exception:

   at Mono.Cecil.PE.ByteBuffer.ReadUInt32()
   at Mono.Cecil.MetadataReader.InitializeCustomDebugInformations()
   at Mono.Cecil.MetadataReader.GetCustomDebugInformation(ICustomDebugInformationProvider provider)
   at Mono.Cecil.Cil.PortablePdbReader.ProcessDebugHeader(ImageDebugHeader header)
   at Mono.Cecil.ModuleDefinition.ReadSymbols(ISymbolReader reader, Boolean throwIfSymbolsAreNotMaching)
   at Mono.Cecil.ModuleReader.ReadSymbols(ModuleDefinition module, ReaderParameters parameters)
   at Mono.Cecil.ModuleReader.CreateModule(Image image, ReaderParameters parameters)
   at Mono.Cecil.ModuleDefinition.ReadModule(String fileName, ReaderParameters parameters)
   at Mono.Cecil.AssemblyDefinition.ReadAssembly(String fileName, ReaderParameters parameters)
   at Program.<Main>$(String[] args)

image

or with dotPeek:

image

I don't quite understand why this happens. I have looked at the relevant dnlib code, but dnlib itself does not fail when writing the .pdb, so I assume that what it writes is not entirely correct.

Any idea?

Thanks,

sunnamed434 commented 1 year ago

As far as I know, PDB reading in dnlib works incorrectly, like most things, because most of it has been out of support for a long time.

ElektroKill commented 1 year ago

This statement is outright misinformation. dnlib does work properly, apart from minor bugs that pop up here and there. Stating that “most things” don’t work correctly is an extreme exaggeration. The specifications for .NET metadata have barely changed, with only some minor additions, which are actively being applied to dnlib via PRs. PDB reading works properly, and this is most likely just a bug in the PDB writer. I will try to look into it and make a PR later.

jespanag commented 1 year ago

It would help me a lot to get more information so that we can work together to fix this and submit a PR, so it doesn't happen to anyone else.

I have been investigating, and the error only happens with portable (cross-platform) PDBs. When the original PDB is a Windows PDB, dnlib works correctly, but that option is impossible to use on macOS.

The problem is that dotnet by default uses Mono.Cecil when compiling Xamarin or MAUI libraries, so a library manipulated with dnlib cannot be used (at least not with its .pdb file, and if you don't generate the .pdb at all, the compiler fails because it can't find it).

Any clue what could be wrong? Have you been able to reproduce the error?

wtfsck commented 1 year ago

@ElektroKill Let me know if you want me to add you to this repo (write access).

jespanag commented 1 year ago

@wtfsck I'll try to help as well. I'll investigate this weekend; if you have any resources that could be useful to me, that would be great!

ElektroKill commented 1 year ago

I had a brief look at it today. It looks like there is a bug in dnlib that causes the CustomDebugInformation rows in the PDB metadata to become corrupt.

Reading the raw values of the rows in the PDB metadata before and after writing makes this more apparent, and even leads to dnlib crashing: image

This test was performed by running the following code after obtaining a Metadata object for the PortablePDB metadata:

for (uint i = 1; ; i++) {
    if (!pdbMd2.TablesStream.TryReadCustomDebugInformationRow(i, out var cdiRow)) {
        break;
    }

    Console.WriteLine("{0}: Kind: {1}, Value: {2}, Parent: {3}", i, cdiRow.Kind, cdiRow.Value, cdiRow.Parent);
}

I will continue to investigate this issue.

ElektroKill commented 1 year ago

Okay, I think I have identified the issue.

After analyzing in a debugger how the CustomDebugInformation rows are read and written, I noticed that when dnlib reads a CustomDebugInformation row, the column sizes are 4 bytes, 2 bytes, and 2 bytes for Parent, Kind, and Value respectively. However, when dnlib writes the rows, it seems to use 2 bytes for every column in the CustomDebugInformation table. When the PDB is then read again by dnlib, it once more tries to read the Parent column as 4 bytes, when in reality it was written as a 2-byte column, resulting in the corruption seen in the image.

The initial read of the assembly, using 4 bytes for the Parent column: image

The writing code, using 2 bytes for the Parent column: image

Reading the assembly written by dnlib using dnlib itself: image
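
To make the mismatch concrete, here is a minimal standalone sketch. It is not dnlib code: the class name, the method names, and the row values are all made up for illustration. It only assumes the layouts described above, with rows written using three 2-byte columns and read back assuming a 4-byte Parent column followed by 2-byte Kind and Value columns.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical demo type; none of these names exist in dnlib.
public static class ColumnSizeMismatchDemo
{
    // Writer side: every column is written as 2 bytes (little-endian), 6 bytes per row.
    public static byte[] WriteRows((ushort Parent, ushort Kind, ushort Value)[] rows)
    {
        var table = new byte[rows.Length * 6];
        for (int i = 0; i < rows.Length; i++)
        {
            var span = table.AsSpan(i * 6, 6);
            BinaryPrimitives.WriteUInt16LittleEndian(span, rows[i].Parent);
            BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(2), rows[i].Kind);
            BinaryPrimitives.WriteUInt16LittleEndian(span.Slice(4), rows[i].Value);
        }
        return table;
    }

    // Reader side: assumes a 4-byte Parent column, so it expects 8 bytes per row.
    public static bool TryReadRow(byte[] table, int index,
        out (uint Parent, ushort Kind, ushort Value) row)
    {
        const int rowSize = 4 + 2 + 2;
        int offset = index * rowSize;
        row = default;
        if (offset + rowSize > table.Length)
            return false; // the reader runs past the end of what was written

        row = (BinaryPrimitives.ReadUInt32LittleEndian(table.AsSpan(offset)),
               BinaryPrimitives.ReadUInt16LittleEndian(table.AsSpan(offset + 4)),
               BinaryPrimitives.ReadUInt16LittleEndian(table.AsSpan(offset + 6)));
        return true;
    }

    public static void Main()
    {
        // Made-up row values, chosen only to make the shifted bytes easy to spot.
        var written = new (ushort Parent, ushort Kind, ushort Value)[]
        {
            (0x0021, 1, 0x0010),
            (0x0041, 2, 0x0020),
        };
        byte[] table = WriteRows(written);

        for (int i = 0; i < written.Length; i++)
        {
            if (!TryReadRow(table, i, out var row))
            {
                Console.WriteLine($"row {i}: ran past the end of the table");
                break;
            }
            Console.WriteLine(
                $"row {i}: Parent=0x{row.Parent:X8} Kind=0x{row.Kind:X4} Value=0x{row.Value:X4}");
        }
        // row 0: Parent=0x00010021 Kind=0x0010 Value=0x0041
        // row 1: ran past the end of the table
    }
}
```

Row 0 comes back with Kind folded into the upper half of Parent and every later column shifted, and row 1 cannot be read at all because the reader expects 16 bytes where only 12 were written. That matches the kind of garbage values and read failures shown in the screenshots.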

I will now try to identify the cause of the varying column sizes!