gimelfarb / ProductionStackTrace

Without deploying PDBs, generate a .NET exception stack trace that can be processed to retrieve source file and line number info

Incorrect symbols being loaded #2

Closed sawilde closed 7 years ago

sawilde commented 9 years ago

While testing the NuGet package against symbols we had pushed to a local drive via symstore, we noticed that calling Convert-ProductionStackTrace on a trace that should have had no matching symbols still resolved a line for each offset. It appeared to be using the wrong symbols (possibly the latest?).

Reviewing the code we note that SymSetOptions was only being called with SYMOPT_DEBUG. We were thinking that SYMOPT_EXACT_SYMBOLS would also be useful to ensure that we don't get false positives.

What do you think?

gimelfarb commented 9 years ago

Hi @sawilde,

Firstly thanks for letting me know about this. I am interested in finding out what happened.

I don't know if SYMOPT_EXACT_SYMBOLS is the culprit here, but to be honest I wouldn't know what exactly it does under the covers. Documentation just says "slightest discrepancy between the symbol files and the symbol handler's expectations", whatever that means.

The thing is that we are looking for a PDB file by a GUID and Age parameter. These change every time the project is re-compiled. So only the PDB generated at the time of compilation should have been picked up. That's the theory anyway.

I am interested in seeing if these match. Can you:

Now the 2nd is more tricky, as there is no easy way to get to it with everyday tools. You need a hex editor like Frhed. It is also not stored in the same location. And bytes are in different order.

For example, a GUID {AA562C79-0D89-4016-B3D9-ECDB8CA457DD}, would be stored as bytes: 79 2C 56 AA 89 0D 16 40 B3 D9 EC DB 8C A4 57 DD.
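To make the byte order above concrete, here is a small Python sketch (not part of the tool) that converts the textual GUID into the byte sequence you would look for in a hex editor. It relies only on the standard `uuid` module, whose `bytes_le` attribute stores the first three GUID fields little-endian and the remaining eight bytes as-is.

```python
import uuid

# The GUID from the example above, as it appears in text form.
guid = uuid.UUID("{AA562C79-0D89-4016-B3D9-ECDB8CA457DD}")

# bytes_le: first three fields are byte-swapped, the last 8 bytes are kept in order.
on_disk = guid.bytes_le

# Format as space-separated hex pairs, the way a hex editor shows them.
hex_bytes = " ".join(f"{b:02X}" for b in on_disk)
print(hex_bytes)
```

Going the other way, `uuid.UUID(bytes_le=on_disk)` turns bytes copied from the hex editor back into the familiar text form.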

jaspermondie commented 9 years ago

Hi @gimelfarb,

Working with @sawilde here on spiking your tool for our solution.

I've done the following:

Based on these findings, do the GUIDs from the dumpbin output matter at all, given that Convert-ProductionStackTrace only uses the output of ExceptionReporting.GetExceptionReport() as its source?

It's also interesting to note that even if the GUID from ExceptionReporting.GetExceptionReport() does not match the PDB (based on @sawilde's post), it still 'successfully' converts using Convert-ProductionStackTrace.

Will do some more digging and will let you know if we find anything.

gimelfarb commented 9 years ago

Hi @jaspermondie,

Thanks for helping look into this issue! It sounds very peculiar, and I am keen to learn what's going on. I am curious as to why there are several GUIDs coming out of dumpbin, as I have only seen one whenever I looked at binaries. Any chance you can post that part of the output?

GetExceptionReport() reads the GUID from the assembly's DLL, as it's mapped in memory, using HINSTANCE handle returned by Marshal.GetHINSTANCE(assembly.ManifestModule). It traverses PE headers to find the debug info, and then extracts GUID from it. So whatever it returns should match what the dumpbin /headers would give you, if you ran it on the same DLL.
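As a rough illustration of the last step (extracting the GUID from the debug info), here is a Python sketch that decodes a CodeView "RSDS" record, the blob that a PE debug directory entry of type CodeView points at. This is not the tool's own code; it assumes the commonly described layout of a 4-byte `RSDS` signature, a 16-byte GUID, a 4-byte little-endian age, and a NUL-terminated PDB path. The function name `parse_rsds` and the sample path are made up for the example.

```python
import struct
import uuid


def parse_rsds(record: bytes):
    """Decode an 'RSDS' CodeView record into (guid, age, pdb_path)."""
    # 4 (signature) + 16 (GUID) + 4 (age) = 24 bytes before the path starts.
    if len(record) < 24 or record[:4] != b"RSDS":
        raise ValueError("not an RSDS CodeView record")
    # The GUID is stored with its first three fields little-endian.
    guid = uuid.UUID(bytes_le=record[4:20])
    # Age is a 32-bit little-endian unsigned integer.
    (age,) = struct.unpack_from("<I", record, 20)
    # The PDB path follows and is terminated by a NUL byte.
    path_bytes = record[24:].split(b"\x00", 1)[0]
    return guid, age, path_bytes.decode("utf-8", errors="replace")


# Hypothetical record built by hand for demonstration purposes.
sample = (
    b"RSDS"
    + bytes.fromhex("792C56AA890D1640B3D9ECDB8CA457DD")
    + struct.pack("<I", 1)
    + b"C:\\build\\App.pdb\x00"
)
guid, age, pdb_path = parse_rsds(sample)
print(guid, age, pdb_path)
```

Locating the record itself (walking the PE headers to the debug directory) is left out to keep the sketch short.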

So in your case you said you did find it in a PDB, right? In which case it is understandable that it was found during Convert-ProductionStackTrace. Or do you have a case where the loaded PDB does not match the GUID?

I would very much like to get the data somehow, to take a look.

jaspermondie commented 9 years ago

Hi @gimelfarb,

Here is a link to the spike solution.

The solution has been modified 4 times with the PDBs stored for each change.

The following is an explanation of the folders:

gimelfarb commented 9 years ago

Just an update - I did manage to reproduce the behavior just now. Very interesting. I'll dig in later today.

gimelfarb commented 9 years ago

@jaspermondie,

Figured it out. What happens is that by default it finds and loads "dbghelp.dll" from Windows\system32 folder, which is a cut-down and generally not useful implementation of DbgHelp library. It is that implementation that goes into a symbol server folder, and doesn't really recognize it is a "symbol folder", and just enumerates all folders until it finds a PDB matching by name. Just loads the first one it finds. It doesn't have any support for symsrv either.

Now, when I forced it to load 'dbghelp.dll' from the Windows SDK, it happily recognized the symbol server folder as a proper 'symbol server' structure (via the pingme.txt file), and loaded the correct PDB. It also refused to load a PDB when the symbol server didn't contain one matching by GUID/Age.
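For readers unfamiliar with what a "proper symbol server structure" looks like, here is a Python sketch of the folder layout that symstore is generally described as producing: `<store>\<pdb name>\<GUID as 32 hex digits><age in hex>\<pdb name>`. The helper name `symstore_pdb_path` and the store and file names are hypothetical, and the sketch ignores compressed files and `file.ptr` redirects, so treat it as an approximation rather than the exact lookup DbgHelp performs.

```python
import uuid
from pathlib import PureWindowsPath


def symstore_pdb_path(store_root, pdb_name, guid, age):
    """Build the expected location of a PDB inside a symstore-style folder."""
    # Index folder: GUID without dashes or braces, followed by the age in hex.
    index = f"{guid.hex.upper()}{age:X}"
    return PureWindowsPath(store_root) / pdb_name / index / pdb_name


# Hypothetical store location and PDB name, using the GUID from earlier in the thread.
expected = symstore_pdb_path(
    r"C:\SymbolServer",
    "App.pdb",
    uuid.UUID("AA562C79-0D89-4016-B3D9-ECDB8CA457DD"),
    1,
)
print(expected)
```

A lookup keyed this way either finds the PDB for that exact build or finds nothing, which is the behavior described above; matching by file name alone is what produces the false positives.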

Anyway, the fix for me to implement is to try and load the "right" dbghelp.dll, by detecting whether a good one is installed on the system. I'll need to do some research on how to find that through the registry.

@jaspermondie, @sawilde - As a workaround right now, you can find that dbghelp.dll (on my PC it is at: C:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64\dbghelp.dll) and place it in the same folder that ProductionStackTrace.Analyzer.Console.exe runs from. This restores the correct behavior of matching the PDB by GUID/Age.

jaspermondie commented 9 years ago

Appreciate the help, @gimelfarb

jaspermondie commented 9 years ago

@gimelfarb, how do I know that the correct dbghelp.dll has been loaded? I've played around with the spike solution with the exception outputs and removed the PDBs, and it looks like it is still spitting out a stack trace for the wrong version.

This time, I am using PS C:\Projects\Spike-ProductionStackTrace> .\ProductionStackTrace.Analyze.Console.exe -s ".\SymbolServer" and pasting in the ExceptionReports into the console.

gimelfarb commented 9 years ago

@jaspermondie - You can check via Process Explorer (View > Lower Pane Details > DLLs). Try x86 version of dbghelp.dll if it is not being loaded, maybe it preferred 32-bit JIT.

jaspermondie commented 9 years ago

@gimelfarb - Thanks for that. I was able to verify that the correct dbghelp.dll is being loaded. However, I'm still getting the same behavior as before. I've uploaded the working folder here for you to verify.

gimelfarb commented 9 years ago

@jaspermondie - Apologies it took so long to reply, got sidetracked with work.

I got your folder. The issue is that dbghelp.dll is not enough on its own; it delegates to symsrv.dll. When it is loaded from the SDK folder this works, because symsrv.dll is found alongside it. In your case it was loaded standalone from the sample folder and couldn't find its companion symsrv.dll.

I fixed it by copying symsrv.dll into the folder as well. After that only the last exception report is translated, the other 3 are not, because the PDB GUIDs don't match.
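For anyone applying the same workaround, a quick sanity check along these lines might help. This Python sketch only verifies that both DLLs sit in a given folder; it does not confirm which copy the process actually loads (Process Explorer is still the way to check that), and the helper name `missing_debug_dlls` is made up for the example.

```python
from pathlib import Path

# The two DLLs that need to sit next to the analyzer executable, per this thread.
REQUIRED_DLLS = ("dbghelp.dll", "symsrv.dll")


def missing_debug_dlls(folder):
    """Return the required DLL names that are not present in the folder."""
    # Compare case-insensitively, since Windows file names are not case-sensitive.
    present = {p.name.lower() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in REQUIRED_DLLS if name not in present]
```

Calling `missing_debug_dlls(".")` from the folder containing the console executable returns an empty list when both files are in place.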

Hope this helps, even if a little late.

jaspermondie commented 9 years ago

@gimelfarb No problem. Thanks for the reply!

I've verified that copying the symsrv.dll file over works. Thanks for helping us with this issue! Much appreciated.