Open NN--- opened 3 years ago
I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.
Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | - |
Labels: | `area-Diagnostics-coreclr`, `untriaged` |
Milestone: | - |
@hoyosjs @mikem8361
@mmitche fyi
Using dotnet-symbol I can download both the Windows and Portable PDBs for the 5.0.0 System.Runtime.dll.
I'm not sure why symsrv/symchk you are using doesn't find it. Can you send me the exact command line you are using?
Looks like the reason is they are doing IL-only PDB fetching. We don't index those system.runtime.ni.pdb/be121b6a3b997441e06296fca1dbdc361/system.runtime.ni.pdb
exists, but that's the after we changed to R2R.
@mikem8361 I use Visual Studio with Microsoft symbols server.
@mikem8361 the image has two debug directories:
ED702F4D cv 11C 00009BB8 7FB8 Format: RSDS, {BE121B6A-3B99-7441-E062-96FCA1DBDC36}, 1, System.Runtime.ni.pdb
ED702F4D cv 6E 00009CD4 80D4 Format: RSDS, {A11BAEED-4C47-491B-80F6-FB479B20A1E9}, 1, F:\workspace\_work\1\s\artifacts\obj\System.Runtime\net5.0-Release\System.Runtime.pdb
We index the R2R pdb, but not the IL-Only one.
Dotnet-symbol downloads the IL one just fine with A11BAEED-4C47-491B-80F6-FB479B20A1E9.
Looks like I had it wrong.
The IL PDB is indexed as
System.Runtime.pdb/A11BAEED4C47491B80F6FB479B20A1E9ffffffff/System.Runtime.pdb
(portable)System.Runtime.pdb/A11BAEED4C47491B80F6FB479B20A1E91/System.Runtime.pdb
(full-windows)The R2R PDB is indexed as only under the full key:
system.runtime.ni.pdb/be121b6a3b997441e06296fca1dbdc361/system.runtime.ni.pdb
(full-windows. Used only in specific cases)However, the check seems to fail. I can repro this locally. Will continue taking a look.
Looking at @NN--- 's trace, this looks the problem:
SYMSRV: System.Runtime.pdb from https://msdl.microsoft.com/download/symbols: 820 bytes
SYMSRV: PATH: c:\symbols\System.Runtime.pdb\A11BAEED4C47491B80F6FB479B20A1E9ffffffff\System.Runtime.pdb
If I am reading that correctly, that means the portable IL PDB is being downloaded (good), but the symbol server is returning something which is either bogus, or which symsrv.dll doesn't understand, as 820 bytes is clearly not enough for a PDB.
@hoyosjs I think your problem could be different from what @NN--- saw. Are you seeing a case where the wrong PDB is downloaded?
@NN--- Do you still have this problem? If so, could you tell us more what 'c:\symbols\System.Runtime.pdb\A11BAEED4C47491B80F6FB479B20A1E9ffffffff\System.Runtime.pdb' looks like --
The PDB for System.Runtime should be pretty small. I verified the two PDBs do exist and the portable version is indeed (Both symweb and msdl) and that they are 820 bytes. (My local build has one with 696 bytes). It also has the classical portable "BSJB" set of first four bytes. I talked a bit with Chuck, and he confirmed what I thought: the SymReader for portable pdbs is saying things don't match. I'll do some testing on top of this removing VS from the equation.
hoyosjs I think your problem could be different from what @NN--- saw. Are you seeing a case where the wrong PDB is downloaded?
Yeah, mine ended up being actually not getting the metadata for creating the key at all. Ironically the only problematic DLL is the one this is System.Runtime. WinDBG somehow reconstructs the key. I might dig a bit more some other day.
Seems like all the ones that fail are the type forwarding assemblies. Although some stuff doesn't add up - if I fetch the symbols from storage something wrong, which hints at a bug in the uploading process.
Tagging subscribers to 'linkable-framework': @eerhardt, @vitek-karas, @LakshanF See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | - |
Labels: | `area-Diagnostics-coreclr`, `area-Meta`, `linkable-framework` |
Milestone: | 6.0.0 |
It looks like this is the linker that is causing this. If I build a facade - say System.Runtime
- on x86 and x64 Windows we get a dll with two issues:
This doesn't happen seem to happen with non-facade assemblies in the shared framework (at least on x64 windows where I verified). Currently there's no major break in the Watson-like scenarios, or dump debugging. However, I still don't fully know what causes the GUID collisions, so if this were ever to happen in non-facade IL assemblies and their PDBs, then we'd have a bigger issue.
So far it looks like for 5.0.0 - 5.0.2 facades for coreclr are indexed under the same guid and all facades for mono builds are under a separate guid if the two runtimes share . (For example System.Runtime forwards to System.Private.CoreLib, so there's one guid that gets used for coreclr and another for mono. On the other hand, System.XML.XPath forwards to System.Private.XML, which is largely shared leading to just using one GUID but three different timestamps).
CC: @tmat @tommcdon
I would think that facades don't run through the linker - there's nothing to do on them anyway. Would be interesting to know why this happens (if it actually does).
As for the GUIDs being the same before and after linking - linker implemented deterministic output, which includes copying the MVID from the input to the output. If we use linker in the deterministic mode, then this behavior is currently by design. But maybe we need to change that behavior.
@vitek-karas Does the linker change anything in the PDB? If not then both timestamp and guid in the CodeView entry should remain the same.
If it does then either 1) The input PDB is never indexed. The linker can then keep both timestamp and guid in the CodeView entry the same as input. 2) The linker needs to update both timestamp and the guid in the CodeView and in the PDB itself (PDB ID) to values based on a hash of the new PDB content (with PDB ID bytes zeroed). Roslyn generates the timestamp and guid by calculating SHA2 hash and using 16 bytes of it for the guid and 4 bytes for the timestamp.
Linker will roundtrip the PDB if it processes an assembly (it has to, if it's making changes to the assembly). The interesting bit here is that running linker on facades doesn't make sense, so we should not do it.
@vitek-karas two quick questions:
Got me a while to make sure I understand what's going on. There are basically two problems here:
PdbChecksum
entry - this is a lack of feature in Mono.Cecil - it has no knowledge about that entry type and can't write it. We will have to implement it if required. I think it already works for Windows PDBs (since it's using the native APIs to do this, which I assume already handle this), but framework uses Portable PDBs and Cecil uses internal implementation of everything. This will be tricky - especially computing the checksum following the rules of computing the has from the entire PDB with ID set to zeroes - the way the implementation works currently is definitely not ideal for such a computation.It seems this should be a problem basically for any assembly which has exactly the same content across architectures.
@vitek-karas - is this something that will be addressed in 6.0?
Hmm - I completely forgot about this. If it's important I can try to find somebody to tackle this.
Not sure if this is the same issue as this or related, but Visual Studio does not find the symbols for System.Runtime.dll
https://referencesource.microsoft.com/symbols: Symbols downloaded from symbol server.
C:\Users\XXXXX\AppData\Local\Temp\SymbolCache\System.Runtime.pdb\76392EF435484AB3882E7D3888EBBEF7ffffffff\System.Runtime.pdb: PDB does not match image.
Click the sections below to see more details
@agocke @vitek-karas - is this something that will be addressed in 6.0?
Tagging subscribers to this area: @vitek-karas, @agocke, @vsadov See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | - |
Labels: | `area-AssemblyLoader-coreclr`, `linkable-framework` |
Milestone: | 6.0.0 |
Tagging subscribers to this area: @dotnet/runtime-infrastructure See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | - |
Labels: | `area-AssemblyLoader-coreclr`, `area-Infrastructure`, `linkable-framework` |
Milestone: | 6.0.0 |
Tagging subscribers to this area: @Anipik, @safern, @ViktorHofer See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | vitek-karas |
Labels: | `area-Infrastructure-libraries`, `linkable-framework` |
Milestone: | 6.0.0 |
@vitek-karas should we keep this in 6.0 or move it out? Asking as you just recently filed an issue for it with the milestone set to 6.0: https://github.com/mono/linker/issues/2203.
I spent some more time looking into this.
While the problem exists in all versions of the linker/.NET, this seems to be primarily .NET 5 problem. In .NET 5 the managed assemblies were built separately for x86 and x64. So for example the .NET 5 System.Runtime.dll
for x86 is a 32bit PE image, while the one for x64 is a 64bit PE image. Otherwise the two file are identical, they only differ in the architecture bit in the PE header. (This is were the bug in the linker shows up).
On .NET 6+ the x86 and x64 builds of System.Runtime.dll
are both 32bit PE images. Interestingly enough we seem to ship different binaries for x86 and x64, even though both are 32bit PE images. There are slight differences in the binary content of these files for the two architectures.
Also the symbol indexing seems to work for 6.0, debugging a simple app in VS correctly loads symbols for all assemblies (including System.Runtime.dll
).
Hi @vitek-karas. Did you identify any problems that we should fix? It sounds like you were able to confirm that 6.0 binaries in the runtime packs matched what's on the symbol server.
There are definitely two separate issues in the linker/cecil.
It doesn't correctly consider all of the assembly when computing a hash (from which it derives the deterministic MVID) - and I was able to repro the problem on a simple example easily (assemblies which only differ by architecture in the PE header will get the same MVID after trimming -> bad).
And second it doesn't write the PdbChecksum entry into the output at all. This is a bit less important, but still would be really good to fix.
What I mentioned above is that it seems that these has much lower impact on .NET 6 than on .NET 5, but they still exist and should be fixed.
On .NET 6+ the x86 and x64 builds of System.Runtime.dll are both 32bit PE images. Interestingly enough we seem to ship different binaries for x86 and x64, even though both are 32bit PE images. There are slight differences in the binary content of these files for the two architectures.
We should fix this too since Xamarin Android uses the MVID to deduplicate architecture-independent assemblies so we definitely want to make sure both builds are identical.
Just update on what I found after digging into this for real.
This is the issue mentioned above. Cecil doesn't include the PE file headers into the MVID hash computation, it only uses data from the IL Metadata tables and blobs. This leads to facades having collisions on MVID as the IL side of the dll is identical. This should be fixed by hashing the entire dll with MVID all zeroed and then just patching the MVID from the computed hash as the last thing to be done on the dll (well except where there is strong name signing, but I don't think we use this in the framework).
Cecil uses the MVID in the PDB as the PDB ID. This is actually very problematic for the fix for Issue 1, especially if there are embedded PDBs. On its own this is not something which should necessarily break anything. But it makes the correct implementation basically impossible, so this will need to be changed.
This causes NuGet to raise warnings (potentially) and should be fixed. This is only applicable to portable PDBs (Cecil also supports Windows Native PDBs and also MDB - mono symbols format), so the changes are not as big, unfortunately this will require changing the order in which things are done in Cecil -> potentially relatively risky change.
The fix for this has been merged into mono/cecil and also into dotnet/linker/main. So this should be fixed in .NET 7 for sure. We're considering bringing the fix to .NET 6 servicing as well (the linker/SDK side of things). https://github.com/dotnet/linker/issues/2203
@vitek-karas should we move this to 8.0?
I'm moving it to 6.0 actually, because this is now only tracking whether we take a backport to 6.0.
Tagging subscribers to this area: @vitek-karas, @agocke, @vsadov See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | vitek-karas |
Labels: | `area-Host`, `linkable-framework` |
Milestone: | 6.0.x |
Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas See info in area-owners.md if you want to be subscribed.
Author: | NN--- |
---|---|
Assignees: | vitek-karas |
Labels: | `linkable-framework`, `area-Tools-ILLink` |
Milestone: | 6.0.x |
Description
There are no symbols for System.Runtime.dll on Microsoft Symbols Server: