Closed: TellowKrinkle closed this issue 3 years ago.
Is it possible to see a trace of the managed stack that triggers the `mmap`/`msync`/`munmap` and `memcpy` calls?
We could change the way System.Reflection.Metadata loads binaries: is there any improvement if you specify `streamOptions: PEStreamOptions.PrefetchEntireImage` in the `PEFile` constructor call? Also, try adding `streamOptions: PEStreamOptions.PrefetchMetadata` to the `UniversalAssemblyResolver` constructor call.
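A minimal sketch of those two changes, assuming the current ICSharpCode.Decompiler API (exact constructor signatures and the actual call sites inside ilspycmd may differ between versions):

```csharp
using ICSharpCode.Decompiler.Metadata;
using System.Reflection.PortableExecutable;

// Illustrative only: read the whole image up front instead of lazily mapping it.
var module = new PEFile("Managed/Assembly-CSharp.dll",
    streamOptions: PEStreamOptions.PrefetchEntireImage);

// Illustrative only: have the resolver prefetch metadata for resolved references.
var resolver = new UniversalAssemblyResolver(
    "Managed/Assembly-CSharp.dll",
    throwOnError: false,
    targetFramework: module.DetectTargetFrameworkId(),
    streamOptions: PEStreamOptions.PrefetchMetadata);
```

With prefetching, System.Reflection.Metadata copies the file contents into memory once instead of serving reads from a memory-mapped view, which should sidestep the repeated `mmap`/`msync`/`munmap` pattern if that is really the cause.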
If that does not improve the situation, it would be very helpful to find out which managed function calls result in the system-call pattern given in your C code.
There might be potential for optimization in the way .NET 5.0 handles memory-mapped files on Unix platforms; it might be worth opening an issue in the dotnet/runtime repository.
> Is it possible to see a trace of the managed stack that triggers the `mmap`/`msync`/`munmap` and `memcpy` calls?
How does one get a managed stack trace? Is it possible to attach lldb, break on msync, and call/do something to get one?
A quick Google search suggests that it is possible with lldb; however, I have never tried it myself, so I cannot give any specific hints... For this to work best, you should clone the ILSpy repository and compile ilspycmd yourself using Frontends.sln as described in the README. If you face any problems while doing that, feel free to ask.
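For what it's worth, one possible approach (untested here) is to combine lldb with the SOS debugging extension, which provides the `clrstack` command for dumping managed stacks. This assumes SOS has been installed via `dotnet-sos install`; the exact commands may need adjusting for your setup:

```
$ lldb -- dotnet ilspycmd.dll Managed/Assembly-CSharp.dll -p -o somewhere
(lldb) breakpoint set --name msync
(lldb) run
# ...when the breakpoint on msync hits, dump the managed stack via SOS:
(lldb) clrstack
```

Each `clrstack` at the breakpoint should then show which managed frames led to the `msync` call.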
I just confirmed using strace that this problem is also present on Linux.
This seems to be a bug in System.Reflection.Metadata: https://github.com/dotnet/runtime/blob/01b7e73cd378145264a7cb7a09365b41ed42b240/src/libraries/System.Reflection.Metadata/src/System/Reflection/PortableExecutable/PEReader.cs#L377
`GetPESectionBlock` always mmaps a new `MemoryBlock` on every call. On the first call, it caches that block and returns it. On every call after the first, it still creates a new `MemoryBlock` but immediately disposes it without using it (it returns the cached memory block instead).
Basically, the cache lacks a "cache hit" code path, going through the "cache miss" code path every single time.
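To make the missing "cache hit" path concrete, here is a tiny self-contained C# sketch of the two shapes. All names here are invented for illustration (`MapSection` stands in for the expensive mmap); this is not the actual PEReader source:

```csharp
using System;

class SectionBlockCacheDemo
{
    static int mapCount = 0;
    static string? cached;

    // Stand-in for the expensive mmap / MemoryBlock creation.
    static string MapSection() { mapCount++; return "mapped section"; }

    // Buggy shape, as described above: the expensive work runs on every
    // call, and the fresh block is discarded whenever the cache is warm.
    static string GetBlockBuggy()
    {
        string block = MapSection();   // happens unconditionally
        if (cached == null)
            cached = block;            // first call: keep it
        // later calls: 'block' is thrown away; 'cached' is returned
        return cached;
    }

    // Fixed shape: consult the cache before doing the work.
    static string GetBlockFixed()
    {
        return cached ??= MapSection();
    }

    static void Main()
    {
        for (int i = 0; i < 3; i++) GetBlockBuggy();
        Console.WriteLine($"buggy: {mapCount} mappings for 3 calls");  // 3
        mapCount = 0; cached = null;
        for (int i = 0; i < 3; i++) GetBlockFixed();
        Console.WriteLine($"fixed: {mapCount} mappings for 3 calls");  // 1
    }
}
```

In the real code the per-call cost is an mmap plus the resulting page-ins, which is why the fix (checking the cache before mapping) eliminates the multi-gigabyte read amplification.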
Tracked externally in dotnet/runtime; closing here.
Steps to reproduce
dotnet ICSharpCode.Decompiler.Console/bin/Release/netcoreapp2.1/ilspycmd.dll Managed/Assembly-CSharp.dll -p -o somewhere
Expected result:
ilspy reads at most somewhere around the total size of all the DLLs in the zip (14 MB or so)
Actual result:
ilspy maxes out an SSD at 400 MB/s read for 10-20 seconds. On slower media (e.g. an SD card), decompilation takes multiple minutes. The final total is around 6.5 GB read. An Instruments system trace shows all the threads constantly calling `mmap` on a file whose size equals that of Assembly-CSharp.dll, then `msync` and `munmap`, followed by large numbers of file-backed page-ins. (The backtrace is broken because Instruments and the JIT don't get along very well.)
Instruments isn't showing for some reason, but dtruss indicates that the set of calls have these arguments:
[dtruss output]
where `0x12` is `MS_SYNC | MS_INVALIDATE`.
My guess is that Assembly-CSharp.dll is mapped somewhere, gets mapped a second time, and then has `msync` called on it, invalidating all FS caches and causing later reads of the original mapping to require a reread from disk. I can confirm that the following code exhibits similarly high disk usage:
Details
ICSharpCode.Decompiler.Console