gluck / il-repack

Open-source alternative to ILMerge
Apache License 2.0

How to debug a merged assembly (.EXE)? #368

Closed InteXX closed 3 months ago

InteXX commented 3 months ago

I'm using a PowerShell script to merge $(TargetDir)\*.dll with $(TargetDir)\Setup.exe into a subfolder $(TargetDir)\Merged\Setup.exe.

The merge is successful, and Setup.exe runs, but I'm getting some JIT errors that don't occur during a debug run and I need to... well... debug them.

But when I attach to the process, Visual Studio's Modules window reports that the "Binary was not built with debug information." So I can't load the symbols.

I'm doing this in Debug configuration, and a .PDB is being generated. But it seems the assembly itself is missing something.

Is there a way to get ILRepack to include debug information during the merge?

KirillOsenkov commented 3 months ago

what version are you using? how do you invoke it?

KirillOsenkov commented 3 months ago

if there's a way for me to debug I'm happy to take a look

InteXX commented 3 months ago

I switched to the ILRepack MSBuild Task, here, but I'm getting the same result.

The version number there appears to align with yours, so it might be reasonable to assume it's the latest—v2.0.34.

Here's my target:

<Target Name="MergeOutput" AfterTargets="Build">
  <ItemGroup>
    <InputAssemblies Include="$(TargetDir)*.exe" />
    <InputAssemblies Include="$(TargetDir)*.dll" />
  </ItemGroup>

  <ILRepack
    Parallel="true"
    DebugInfo="true"
    InputAssemblies="@(InputAssemblies)"
    TargetKind="SameAsPrimaryAssembly"
    OutputFile="$(TargetDir)Merged\$(TargetFileName)"
  />
</Target>

Note that I'm including the DebugInfo attribute, but that hasn't made a difference. Since a PDB was being generated before with my PowerShell script that used only one argument (/out), I suspect that the default for DebugInfo is True.

I'll see if I can whip up a quick repro project. Give me a day or two—I'm busy tomorrow with other stuff.

KirillOsenkov commented 3 months ago

One thing I should mention is that you should make sure all input assemblies have a matching .pdb file next to them. ILRepack supports merging pdb files, but if the original pdb files are missing or not matching, things will obviously not work.

You need to be able to debug the original assemblies to be able to debug merged assemblies.

You can use the pdb dotnet tool to verify whether a pair of .dll and .pdb match:

dotnet tool update -g pdb
pdb lib.dll lib.pdb

InteXX commented 3 months ago

Ah, that may have something to do with it. Most of the assemblies I'm merging are Microsoft's, so I don't believe I have access to those PDBs. Another one is Autofac, and I'm pretty certain I won't be able to get that one either.

In fact, the only PDB that I do have access to is the entry point, Setup.exe—my application.

It's worth noting that I'm targeting net48 with this.

InteXX commented 3 months ago

ChatGPT tells me that I'll have to use ILSpy to decompile the newly merged assembly and recompile it to get a new (and matching) PDB. Which sounds like a recipe for failure to me, at least when trying to automate it.

Is there anything to this claim? (I've learned not to place too much trust in that thing.)

KirillOsenkov commented 3 months ago

as long as your setup.exe has a pdb you should be fine I think

InteXX commented 3 months ago

I'll need my ration of sleep first, and then I'll get that repro project together.

KirillOsenkov commented 3 months ago

you can compile your setup.exe with <DebugType>embedded</DebugType>, then the pdb will be inside the exe, and inside the merged one too.

You shouldn't need to roundtrip via ILSpy because normally VS can debug without pdbs and decompile on the fly.

InteXX commented 3 months ago

<DebugType>embedded</DebugType>

Good tip, thanks.

InteXX commented 3 months ago

I get this, even after embedding Setup.pdb in the debug build output before merging the assemblies:

image

That's from the source of Setup.exe.

One clarification, please... when VS reports "Binary was not built with debug information," what exactly does that mean?

  1. That some necessary action was not performed on the assembly itself? Or rather:
  2. That a matching .pdb couldn't be found for the assembly?

(Hat tip to your earlier tip dotnet tool update -g pdb—that came in handy for helping my understanding of all this.)

I'll get started on that repro project now. It shouldn't take me long.

KirillOsenkov commented 3 months ago

Oh interesting, this is VB. Might be related (or not??)

In VS, try going to Debug -> Windows -> Modules, find the module for Setup.exe, right-click and do Symbol Load Information...

Also run the pdb dotnet tool on the merged assembly, and paste the output here.

InteXX commented 3 months ago

Might be related (or not??)

I'm guessing not, since we're talking IL here. Correct me, please.

Here's the Symbol Load Information for the merged Setup.exe (which contains the .pdb from the pre-merged Debug build):

image

The command pdb Setup.exe only echoes the assembly name & location to the screen:

D:\Dev\Projects\Setup\bin\Debug\Merged>pdb Setup.exe
D:\Dev\Projects\Setup\bin\Debug\Merged\Setup.exe

Note there is no .pdb in the Merged folder, which is expected after the embedding.

InteXX commented 3 months ago

I wasn't able to reproduce this in my test project.

My Setup.exe project is targeting net48 and some of the assemblies I'm merging are netstandard2.0. That might have something to do with it. I'll try removing those temporarily.

InteXX commented 3 months ago

Also run the pdb dotnet tool on the merged assembly, and paste the output here

Oh, I think you must've meant run the pdb tool on the assembly and its accompanying pdb. I did do that earlier, before I embedded the pdb.

Here's what I got from that:

D:\Dev\Projects\Setup\bin\Debug\Merged>pdb setup.exe setup.pdb
D:\Dev\Projects\Setup\bin\Debug\Merged\setup.exe

No match
D:\Dev\Projects\Setup\bin\Debug\Merged\setup.pdb: Native pdb: Microsoft C/C++ MSF 7.00

Maybe there's something wonky going on with the generation of that .pdb file? It's also quite large, weighing in at 7.63MB where the source .pdb is only 146KB.

InteXX commented 3 months ago

some of the assemblies I'm merging are netstandard2.0

Nope, that wasn't it. I added a netstandard2.0 package to my test project and I was able to break on the merged assembly.

I think a big clue is that Native pdb. Something's triggering ILRepack to create that format—one of the input assemblies, perhaps?

Here's the list:

Autofac.dll
BouncyCastle.Crypto.dll
Microsoft.Bcl.AsyncInterfaces.dll
Microsoft.Win32.Registry.dll
MoreLinq.dll
Serilog.dll
Serilog.Sinks.File.dll
System.Buffers.dll
System.Diagnostics.DiagnosticSource.dll
System.IO.FileSystem.AccessControl.dll
System.Memory.dll
System.Numerics.Vectors.dll
System.Runtime.CompilerServices.Unsafe.dll
System.Security.AccessControl.dll
System.Security.Principal.Windows.dll
System.Threading.Channels.dll
System.Threading.Tasks.Extensions.dll

See any immediate suspects? FWIW I'm eyeballing that BouncyCastle. It's only there as a transitive package, so I'll see if I can get rid of it.

InteXX commented 3 months ago

BouncyCastle Native pdb

Nope, that's not the one:

D:\Dev\Projects\GitHub\PdbRepro\PdbRepro\bin\Debug\Merged>pdb PdbRepro.exe PdbRepro.pdb
D:\Dev\Projects\GitHub\PdbRepro\PdbRepro\bin\Debug\Merged\PdbRepro.exe

Guid:     f83a6a0c-ef33-484d-a7a6-e9b162ec575a
Age:      1
Pdb path: D:\Dev\Projects\GitHub\PdbRepro\PdbRepro\bin\Debug\Merged\ILRepack-36132-647503\PdbRepro.pdb
Stamp:    66BA778F

Match
D:\Dev\Projects\GitHub\PdbRepro\PdbRepro\bin\Debug\Merged\PdbRepro.pdb: Native pdb: Microsoft C/C++ MSF 7.00

I guess I'm going to have to go through those 22 assemblies one-by-one and see if I can figure out what's triggering the broken .pdb. Whew.

KirillOsenkov commented 3 months ago

No match is key here - it means that pdb doesn't even match the dll

InteXX commented 3 months ago

it means that pdb doesn't even match the dll

Right. The pre-merge .pdb matches, but the post-merge .pdb doesn't.

My next troubleshooting step is to take the packages currently referenced by the Setup.exe project, add them to my repro project one by one, and try a merge after each addition. If I can get a repro, that'll pinpoint the culprit.

InteXX commented 3 months ago

OK, I found the problem. It's my fault. I broke it.

I'm using a build target to modify the assembly's resources after it was merged (and after the merged .pdb was created). It's only natural that the two don't match.

But here's the problem: I have to do that.

Several of the Microsoft assemblies contain an XML file resource, all with the same name (ILLink.Substitutions.xml), but with different content per assembly.

I don't want to risk a crash in my application when one of these merged assemblies tries to get the XML it needs and can't. So I wrote a small console app that extracts the XML from each assembly, merges it into a single XML document (all the root node elements bear the same name), and then adds it to the single merged Setup.exe after ILRepack has done its work. So I'm in a bit of a Catch-22, as you can see.

Can ILRepack rebuild that .pdb after I've fixed the resources?
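
For readers following along, the merge step described above (combining several XML documents that share a root element name into one) could be sketched roughly as follows. This is an illustration only, not InteXX's console app and not ILRepack's ILLinkFileMergeStep; the function name merge_same_root is hypothetical, and the sketch assumes it is enough to concatenate the children of each root in input order.

```python
import xml.etree.ElementTree as ET


def merge_same_root(documents):
    """Merge XML documents that share a root element name into one document.

    Hypothetical helper for illustration: children of each root are appended
    in input order under a single new root; nothing is deduplicated.
    """
    merged = None
    for text in documents:
        root = ET.fromstring(text)
        if merged is None:
            # Reuse the tag and attributes of the first root for the output.
            merged = ET.Element(root.tag, root.attrib)
        elif root.tag != merged.tag:
            raise ValueError(
                f"root element mismatch: {root.tag!r} != {merged.tag!r}"
            )
        merged.extend(list(root))
    if merged is None:
        raise ValueError("no documents to merge")
    return ET.tostring(merged, encoding="unicode")
```

Note that rewriting the resources of an already-merged assembly, as described above, changes the binary after its .pdb was produced, which is exactly the mismatch reported earlier in this thread.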

KirillOsenkov commented 3 months ago

It means they were already repacked using ILRepack I think?

ILRepack definitely has logic to merge these, so you don't need to have your own custom IL rewriting step: https://github.com/gluck/il-repack/blob/master/ILRepack/Steps/ILLinkFileMergeStep.cs

I'm not super familiar with this but I think you can delete your custom tool.

KirillOsenkov commented 3 months ago

I think you need the new /illink command-line argument to enable this. The feature was added on January 4: https://github.com/gluck/il-repack/commit/986c1dbc761cce86caf1d8d79adfbcb47b5e0e06

InteXX commented 3 months ago

Darn! Somebody found it before me 😉

InteXX commented 3 months ago

Thanks a bunch! 👍

InteXX commented 3 months ago

Oh, one last thing...

Does "Binary was not built with debug information" simply mean that the assembly and its .pdb don't match? Or is it something more complex than that, such as some sort of pointer not being added to the assembly?

InteXX commented 3 months ago

ChatGPT tells us this:

When you see the message "Binary was not built with debug information," it could mean one of the following:

  • PDB Missing or Mismatched: The most common case is that the PDB file is either missing or doesn't match the assembly.
  • No Debugging Information Embedded: The assembly was compiled without embedding any pointers or references to a PDB file. This can happen if the assembly was compiled in a mode where generating debug information was disabled (e.g., a release build without debug symbols).
  • Assembly Was Optimized: In some cases, if the code was heavily optimized during compilation, the debug information might be limited or not useful, even if a PDB exists.

Would you agree?

KirillOsenkov commented 3 months ago

I just built a console app with DebugType none, and the output of pdb on it was:

Reproducible

This means C# used deterministic compilation (the modern default) instead of stamping the binary with the build time.

Then I switched the DebugType to full:

Reproducible
Guid:     f6ac1949-8f79-432d-b5b7-701ffa9c0c4a
Age:      1
Pdb path: C:\temp\net472\obj\Debug\net472\net472.pdb
Stamp:    FCD9332E

Match
C:\temp\net472\bin\Debug\net472\net472.pdb: Native pdb: Microsoft C/C++ MSF 7.00

Now the binary includes a debug directory entry with the information about the pdb: Guid, Age (normally 1), path to the pdb at the moment of compilation and the stamp.

It also finds the .pdb next to the .exe and says that the pdb matches (has the same Guid and age), and prints that it's an old-style (Native) Pdb (C/C++ MSF 7.00 is just the file format).

"Binary was not built with debug information" means that the debug directory entry is missing from the binary itself. It's not just that the pdb is missing or mismatched.

If you now delete the .pdb from disk and re-run, you get:

Reproducible
Guid:     f6ac1949-8f79-432d-b5b7-701ffa9c0c4a
Age:      1
Pdb path: C:\temp\net472\obj\Debug\net472\net472.pdb
Stamp:    FCD9332E

It still has the record but now it can't find the Pdb so it won't attempt to match it or print the pdb format.

If you now switch to DebugType portable, you will get:

Reproducible
Guid:     2a7f7035-3e99-49b3-88f0-845d984358c3
Age:      1
Pdb path: C:\temp\net472\obj\Debug\net472\net472.pdb
Stamp:    D44F3549

Algorithm: SHA256
Checksum: 35707F2A993EB36948F0845D984358C349354F5469DE884D4F6094BDDA138EA2

Match
C:\temp\net472\bin\Debug\net472\net472.pdb: Portable pdb
Guid:     2a7f7035-3e99-49b3-88f0-845d984358c3
Stamp:    49354FD4

For portable Pdbs it is also able to tell you what the Guid and stamp are.

Finally you can switch to DebugType embedded, and you will get:

Reproducible
Guid:     2a7f7035-3e99-49b3-88f0-845d984358c3
Age:      1
Pdb path: net472.pdb
Stamp:    D44F3549

Algorithm: SHA256
Checksum: 35707F2A993EB36948F0845D984358C349354F5469DE884D4F6094BDDA138EA2

Contains embedded pdb

KirillOsenkov commented 3 months ago

So from the ChatGpt output the second bullet is correct.

InteXX commented 3 months ago

I owe you a milkshake.

KirillOsenkov commented 3 months ago

And the fact that when you ran pdb on your assembly it printed nothing (just the file path) means that the rewritten assembly was neither reproducible nor contained a debug directory entry with the pdb info :)
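
As a rough way to check the "debug directory entry" point without any extra tooling, the sketch below reads the PE headers and reports whether the debug data directory is populated. It is a minimal illustration under stated assumptions: the function name debug_directory is hypothetical, it only looks at the data-directory slot (index 6 in the PE optional header) and does not parse the entries themselves, and it is not how the pdb tool or Visual Studio is implemented.

```python
import struct

DEBUG_DIRECTORY_INDEX = 6  # slot of the debug data directory in the optional header


def debug_directory(pe_bytes):
    """Return (rva, size) of the PE debug data directory, or None if absent."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # e_lfanew at offset 0x3C points at the 'PE\0\0' signature.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = pe_offset + 24
    (magic,) = struct.unpack_from("<H", pe_bytes, opt)
    if magic == 0x10B:      # PE32
        dirs = opt + 96
    elif magic == 0x20B:    # PE32+
        dirs = opt + 112
    else:
        raise ValueError(f"unknown optional header magic: {magic:#x}")
    # NumberOfRvaAndSizes sits immediately before the data directories.
    (count,) = struct.unpack_from("<I", pe_bytes, dirs - 4)
    if count <= DEBUG_DIRECTORY_INDEX:
        return None
    rva, size = struct.unpack_from("<II", pe_bytes, dirs + 8 * DEBUG_DIRECTORY_INDEX)
    if rva == 0 or size == 0:
        return None
    return rva, size
```

Calling debug_directory on the bytes of the rewritten Setup.exe would be expected to return None if the debug directory entry was lost, which is consistent with the pdb tool printing only the file path.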