microsoft / Windows-Dev-Performance

A repo for developers on Windows to file issues that impede their productivity, efficiency, and efficacy

Symbol server PE files are being overwritten with different versions #102

Closed randomascii closed 8 months ago

randomascii commented 2 years ago

Windows Build Number

Win32NT 10.0.22000.0 Microsoft Windows NT 10.0.22000.0

Processor Architecture

AMD64

Memory

16 GB

Storage Type, free / capacity

SSD 160GB/350GB

Relevant apps installed

windbg, Chrome

Traces collected via Feedback Hub

Sorry, no feedback hub traces

Issue description

Different versions of kernel32.dll are being published to Microsoft's symbol server with the same timestamp/image-size/filename triplet, with each new version overwriting the previous version. This means that when examining a crash dump from 10.0.22000.376 Windows 11 the results of "lmv m kernel32" will (as of this writing) report that the DLL is both version .318 and version .376. Actually, the exact results depend on the contents of your local symbol server cache and the second number could be .318, .347, or .376, or possibly other numbers.

Steps to reproduce

Use chrome://crash or similar to create a minidump without heap. Load it on various machines and run "lmv m kernel32". You can also use sigcheck to compare the version numbers of kernel32.dll on the machine and in the symbol server. Use dumpbin /headers to verify that the timestamp and image size match while the version number varies.
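
For anyone who wants to script the dumpbin /headers comparison, here is a minimal Python sketch that reads the two header fields in question directly from a PE file. The function name pe_timestamp_and_size is made up for illustration. The offsets follow the published PE/COFF layout, but treat this as a sketch rather than a hardened parser.

```python
import struct
import sys


def pe_timestamp_and_size(data: bytes) -> tuple[int, int]:
    """Return (TimeDateStamp, SizeOfImage) from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew (offset 0x3C in the DOS header) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header starts right after the 4-byte signature;
    # TimeDateStamp sits 4 bytes into it.
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 4 + 4)
    # The optional header starts after the 20-byte COFF header;
    # SizeOfImage is at offset 56 for both PE32 and PE32+.
    (size_of_image,) = struct.unpack_from("<I", data, pe_offset + 4 + 20 + 56)
    return timestamp, size_of_image


if __name__ == "__main__":
    # Usage: python pe_fields.py path\to\kernel32.dll [more files...]
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            ts, size = pe_timestamp_and_size(f.read())
        print(f"{path}: timestamp={ts:08X} image_size={size:x}")
```

Running it over several copies of kernel32.dll should print the same two values for each if they share a symbol server key, even when sigcheck reports different version numbers.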

Expected Behavior

It is crucial that the symbol server triplet uniquely identify a particular file. Identifying a file which is "the same except for the version number" is not actually correct.

Actual Behavior

Microsoft's deterministic build system seems to be publishing files that are not identical (version number has changed) but overwrite each other in the symbol server. Nope. That's not good.

randomascii commented 2 years ago

I've put the relevant files (three copies of kernel32.dll and a crash dump that will happily refer to any one of them) in this shared Google drive folder: https://drive.google.com/drive/folders/1EUGs6iNqNXvWb98KMYcSBbXzIXUWGoWw

zooba commented 2 years ago

Thanks, I'm seeing if I can chase down someone who may know what's going on here. This seems bad, though there may be a legitimate reason for it.

I assume you've got the latest windbg available?

randomascii commented 2 years ago

Yeah, I tried this with the latest windbg, but it won't matter because it's the timestamp/image-size/name triplet reuse that is the problem. I can reproduce the issue using RetrieveSymbols.exe (https://github.com/google/UIforETW/blob/main/bin/RetrieveSymbols.exe) but also I can see that I have three different files with the same timestamp/image-size/name triplet, and once that is true there is nothing that windbg or any other symbol-server client can do.
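
To make the "same triplet, different file" condition concrete, here is a small Python sketch that groups file records by the triplet and reports any key that maps to more than one distinct content hash. The function name find_key_collisions and the record layout are assumptions for illustration. The hashes would come from whatever tool you already use (for example a SHA-256 of each cached copy).

```python
from collections import defaultdict


def find_key_collisions(records):
    """Group records by symbol server triplet and return colliding keys.

    records: iterable of (filename, timestamp, size_of_image, content_hash).
    Returns a dict mapping each triplet to the set of distinct hashes seen
    for it, keeping only triplets that have more than one hash.
    """
    by_key = defaultdict(set)
    for filename, timestamp, size_of_image, content_hash in records:
        # File names are compared case-insensitively, as on Windows.
        by_key[(filename.lower(), timestamp, size_of_image)].add(content_hash)
    return {key: hashes for key, hashes in by_key.items() if len(hashes) > 1}
```

Any non-empty result means that a client asking for that triplet cannot know which of the files it will receive.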

randomascii commented 2 years ago

This has been reported as a problem in the past, such as here, where it causes (understandably) developer confusion:

https://github.com/m417z/winbindex/issues/139

AvriMSFT commented 2 years ago

Hey @zooba! Any updates here?

rhuijben commented 1 year ago

Any news half a year later? @AvriMSFT @zooba?

zooba commented 1 year ago

The Windows build team acknowledged that they're doing it, but I don't know how much further it got than that. If it's still happening, probably didn't go anywhere. I'll see if the people I was in touch with are still on it.

AdamBraden commented 8 months ago

Closing the loop here - this was resolved over the summer. Let us know if something is not working right.

randomascii commented 8 months ago

By "resolved" does this mean that files will not be overwritten anymore? As in, retrieving PE files from the symbol server should now retrieve the correct version? That would be great.

m417z commented 8 months ago

It would indeed be interesting to see how it was resolved. I still see the same issue in recent files. For example: wiatrace.dll: https://winbindex.m417z.com/?file=wiatrace.dll Versions 10.0.19041.3570 and 10.0.19041.3636, x64 architecture. The hashes are:

Both files have the same timestamp and image size, so the link for both files is: https://msdl.microsoft.com/download/symbols/wiatrace.dll/00EA0FC7a000/wiatrace.dll
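
As a sketch of why both versions land on the same link: the download path is built only from the file name, the timestamp, and the image size. The Python below assumes the key is the timestamp as eight uppercase hex digits followed by the image size in lowercase hex without padding, which matches the links in this thread. The helper names are made up for illustration.

```python
def symbol_server_key(timestamp: int, size_of_image: int) -> str:
    """Build the timestamp + image-size key used in the download path."""
    return f"{timestamp:08X}{size_of_image:x}"


def symbol_server_url(
    filename: str,
    timestamp: int,
    size_of_image: int,
    server: str = "https://msdl.microsoft.com/download/symbols",
) -> str:
    """Build the download URL for a PE file from its triplet."""
    key = symbol_server_key(timestamp, size_of_image)
    return f"{server}/{filename}/{key}/{filename}"


# The version number is not an input, so two builds that share a timestamp
# and image size necessarily map to the same URL.
print(symbol_server_url("wiatrace.dll", 0x00EA0FC7, 0xA000))
```

Nothing in the path distinguishes 10.0.19041.3570 from 10.0.19041.3636, so whichever file was published last is the one served.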

The files were published on October 10 and 26, respectively, so after "this summer", I assume.

zooba commented 7 months ago

It's been resolved, but only in the current branch (at the time, a few months ago). So build 19041 (Windows 10 April 2020 update) is not going to have the correction, but newer builds should be fine. I understand the issue is inherently part of the build process, rather than the server implementation.

m417z commented 4 months ago

Here's another example: wiatrace.dll: https://winbindex.m417z.com/?arch=insider&file=wiatrace.dll Versions 10.0.23619.1000 and 10.0.23620.1000, x64 architecture. The hashes are:

Both files have the same timestamp and image size, so the link for both files is: https://msdl.microsoft.com/download/symbols/wiatrace.dll/C5004E2Ba000/wiatrace.dll

The files were published on January 18 and 25, 2024, respectively. How new should they be to have the fix? I'm also curious how the fix was implemented. Are there any details about that?

zooba commented 4 months ago

I don't have the exact number, but I believe the fix is somewhere during the 10.0.25???.? version numbers, so anything before that is likely to keep showing it (as the fix is not in those branches), while newer builds should have updated timestamps to match the changed version number.