icsharpcode / ILSpy

.NET Decompiler with support for PDB generation, ReadyToRun, Metadata (&more) - cross-platform!
21.65k stars 3.36k forks source link

File reading breaks filesystem cache on macOS #2416

Closed TellowKrinkle closed 3 years ago

TellowKrinkle commented 3 years ago

Steps to reproduce

  1. Attempt to decompile Assembly-CSharp.dll from this using dotnet ICSharpCode.Decompiler.Console/bin/Release/netcoreapp2.1/ilspycmd.dll Managed/Assembly-CSharp.dll -p -o somewhere
  2. Watch disk usage in Activity Monitor

Expected result:

ilspy reads at most somewhere around the size of all dlls in the zip (14MB or so)

Actual result:

ilspy maxes out an SSD at 400MB/s read for 10-20 seconds. If it's on slower media (e.g. an SD card), decompilation takes multiple minutes. Final total is around 6.5GB read An Instruments system trace indicates all the threads constantly calling mmap on a file with a size equal to that of Assembly-CSharp.dll, msync, and munmap, followed by large numbers of file-backed page-ins:

image

(Backtrace is broken because Instruments and JIT don't get along very well)

Instruments isn't showing for some reason, but dtruss indicates that the set of calls have these arguments:

 7476/0x2c984:  mmap(0x0, 0x117254, 0x1, 0x40001, 0x24, 0x0)         = 0x10AC5A000 0
 7476/0x2c984:  msync(0x10AC5A000, 0x116E54, 0x12)       = 0 0
 7476/0x2c984:  munmap(0x10AC5A000, 0x117254)        = 0 0

Where 0x12 is MS_SYNC | MS_INVALIDATE

My guess is that Assembly-CSharp.dll is mapped somewhere, gets mapped a second time, then gets msync called, invalidating all FS caches, resulting in later reads to the original mapping requiring a reread from disk. I can confirm that the following code exhibits similarly high disk usage:

int main(int argc, const char * argv[]) {
    struct stat s;
    int fd = open(argv[1], O_RDONLY);
    fstat(fd, &s);
    void* p1 = mmap(NULL, s.st_size, PROT_READ, MAP_SHARED | MAP_FILE, fd, 0);
    void* p2 = malloc(s.st_size);
    while (1) {
        void* ptr = mmap(NULL, s.st_size, PROT_READ, MAP_SHARED | MAP_FILE, fd, 0);
        msync(ptr, s.st_size, MS_SYNC | MS_INVALIDATE);
        munmap(ptr, s.st_size);
        memcpy(p2, p1, s.st_size);
    }
    return 0;
}

Details

siegfriedpammer commented 3 years ago

Is it possible to see a trace of the managed stack that triggers the mmap/msync/munmap and memcpy calls?

We could change the way System.Reflection.Metadata loads binaries:

https://github.com/icsharpcode/ILSpy/blob/46e9f633ca5e9bc06feb21421439b00a127ee440/ICSharpCode.Decompiler.Console/IlspyCmdProgram.cs#L148-L158

Is there any improvement if you specify streamOptions: PEStreamOptions.PrefetchEntireImage in the PEFile constructor call? Also, try adding streamOptions: PEStreamOptions.PrefetchMetadata to the UniversalAssemblyResolver constructor call.

If that does not improve the situation, it would be very helpful to find out, which managed function calls result in the system call pattern given in your C code.

There might be potential for optimization in the way .NET 5.0 handles memory mapped files on Unix platforms, might be worth opening an issue in the dotnet/runtime repository.

TellowKrinkle commented 3 years ago

Is it possible to see a trace of the managed stack that triggers the mmap/msync/munmap and memcpy calls?

How does one get a managed stack trace? Is it possible to attach lldb, break on msync, and call/do something to get one?

siegfriedpammer commented 3 years ago

a quick google search has told me that it is possible to use lldb, however, I have never tried it myself so I cannot give any specific hints... For this to work out best, you should clone the ILSpy repostory and compile ilspycmd yourself using Frontends.sln as described in the README. If you face any problems while doing that, feel free to ask.

siegfriedpammer commented 3 years ago

I just confirmed using strace that this problem is also present on Linux.

dgrunwald commented 3 years ago

This seems to be a bug in System.Reflection.Metadata: https://github.com/dotnet/runtime/blob/01b7e73cd378145264a7cb7a09365b41ed42b240/src/libraries/System.Reflection.Metadata/src/System/Reflection/PortableExecutable/PEReader.cs#L377 GetPESectionBlock always mmaps a new MemoryBlock on every call. On the first call, it cached that block and returns it. On every call after the first, it still creates a new MemoryBlock but immediately disposes it without using it (it instead returns the cached memory block). Basically the cache lacks the "cache hit" code path, going through the "cache miss" code path every single time.

christophwille commented 3 years ago

Tracked at external location dotnet/runtime, closing here.