Nexus-Mods / NexusMods.App

Home of the development of the Nexus Mods App
https://nexus-mods.github.io/NexusMods.App/
GNU General Public License v3.0
865 stars 43 forks

Insane heap allocations when downloading #1927

Closed erri120 closed 2 weeks ago

erri120 commented 3 weeks ago

Running the project through Rider in Debug mode with the debugger attached slows performance down drastically. When downloading, the entire UI freezes if the debugger is attached.

This isn't related to exceptions being swallowed somewhere; I enabled every exception to trigger a breakpoint, and we only have one looping exception in the CliServer, by design.

It'd be interesting to know whether debugging with Rider on Windows or Visual Studio on Windows also carries a massive performance penalty, or whether this is only a Rider-on-Linux issue. If it shows up on every combination of platform and IDE, then we have a real problem.

Sewer56 commented 3 weeks ago

[Copied and adapted from Slack]

I tried starting debugging from VSCode. Same issue. May be upstream even beyond Rider.

This is strange. While I don't know the exact details, I would expect Rider and VSCode to use somewhat different debuggers under the hood.

VSCode uses the proprietary vsdbg, which is only licensed for use in VS, VSCode, and VS for Mac. I imagine Rider uses its own proprietary debugger. There's also Samsung's netcoredbg, though I'm not 100% sure whether the issue happens there too.

The fact it slows down to a crawl on at least 2 debuggers is really worrying though.

Edit: https://open-vsx.org/extension/muhammad-sammy/csharp

This extension should get you the Samsung debugger in VSCode.

halgari commented 3 weeks ago

Ran some profiling, and yes, it slows down during downloads, but not by an incredible amount on my machine. I did notice the heap filling up: lots of allocations and GCs during the download, eventually reaching about 2100MB of memory usage (up from 300MB) after just a few seconds of the download. I attached a memory profiler and didn't see that level of memory usage with the profiler attached, but the downloader we're using is not optimized for memory usage at all.

For the 6GB file I downloaded, I saw:

Digging into the downloader and setting the block size to something sane (like 16MB) may help here, but we might also want to submit a PR upstream that cleans this stuff up.
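
As a rough illustration of the first option, here is a minimal sketch of bumping the block size via the library's DownloadConfiguration. BufferBlockSize is the field read in the ChunkDownloader snippet quoted further down; the other property and method names are from memory of the bezzad/Downloader API and worth double-checking against the version we ship.

```csharp
using Downloader;

// Sketch: raise the per-read buffer from the library default to something sane.
// Note: each read still allocates a fresh byte[] of this size (see the loop below),
// so a 16MB block trades allocation count for large-object-heap allocations.
var config = new DownloadConfiguration
{
    BufferBlockSize = 16 * 1024 * 1024, // 16MB per ReadAsync call
    ChunkCount = 4,                     // assumption: a handful of parallel chunks
};

var downloader = new DownloadService(config);
await downloader.DownloadFileTaskAsync(
    "https://example.com/some-large-mod.7z", // placeholder URL
    "/tmp/some-large-mod.7z");               // placeholder destination
```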

Sewer56 commented 3 weeks ago

Performance issues with the downloader are a separate thing and should probably get an issue of their own; downloads max out at 13.5MB/s on my machine when I'd expect 60-100MB/s.

What we're experiencing here, though, is the App pretty much completely locking up when the debugger is attached on Linux. We're talking around 0.5FPS. Happens with both the Rider debugger and vsdbg.

When only the profiler is attached, the App does not lock up.

erri120 commented 3 weeks ago
> 2.8GB of CancellationTokenSource allocated

That's very likely the CliServer, which creates a new source every second. That code is whack.

halgari commented 3 weeks ago

No... the CLIServer creates a new source once per second, and that 6GB took roughly 30 seconds to download. We're talking about over 100 million cancellation token source creations, not 15.

The downloader code, however, bottoms out at this:

https://github.com/bezzad/Downloader/blob/f0b8ebbc09ff11ffee07b7c46eb342fd3e8b9549/src/Downloader/ChunkDownloader.cs#L138-L193

            using var _ = cancelToken.Register(stream.Close);
            while (readSize > 0 && Chunk.CanWrite)
            {
                cancelToken.ThrowIfCancellationRequested();
                await pauseToken.WaitWhilePausedAsync().ConfigureAwait(false);
                byte[] buffer = new byte[_configuration.BufferBlockSize];
                using var innerCts = CancellationTokenSource.CreateLinkedTokenSource(cancelToken);
                innerToken = innerCts.Token;
                innerCts.CancelAfter(Chunk.Timeout);
                using (innerToken.Value.Register(stream.Close))
                {
                    // if innerToken timeout occurs, close the stream just during the reading stream
                    readSize = await stream.ReadAsync(buffer, 0, buffer.Length, innerToken.Value).ConfigureAwait(false);
                    _logger?.LogDebug($"Read {readSize}bytes of the chunk {Chunk.Id} stream");
                }

                readSize = (int)Math.Min(Chunk.EmptyLength, readSize);
                if (readSize > 0)
                {
                    await _storage.WriteAsync(Chunk.Start + Chunk.Position - _configuration.RangeLow, buffer, readSize).ConfigureAwait(false);
                    _logger?.LogDebug($"Write {readSize}bytes in the chunk {Chunk.Id}");
                    Chunk.Position += readSize;
                    _logger?.LogDebug($"The chunk {Chunk.Id} current position is: {Chunk.Position} of {Chunk.Length}");

                    OnDownloadProgressChanged(new DownloadProgressChangedEventArgs(Chunk.Id) {
                        TotalBytesToReceive = Chunk.Length,
                        ReceivedBytesSize = Chunk.Position,
                        ProgressedByteSize = readSize,
                        ReceivedBytes = buffer.Take(readSize).ToArray()
                    });
                }
            }
halgari commented 3 weeks ago

That's a memory allocation nightmare, considering the chunk size is 8KB after I upped it from the default of 1KB.

Sewer56 commented 3 weeks ago

Misc Note: That code should be borrowing memory from an ArrayPool.
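
For illustration, a minimal sketch of that pattern (not the Downloader library's actual code): rent one buffer from ArrayPool<byte> for the lifetime of the transfer instead of allocating `new byte[BufferBlockSize]` on every loop iteration.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Sketch only: demonstrates the ArrayPool pattern, not the library's API.
internal static class PooledCopy
{
    // Rents one buffer for the whole transfer instead of allocating a fresh
    // byte[] on every iteration of the read loop.
    public static async Task CopyAsync(
        Stream source, Stream destination, int bufferBlockSize, CancellationToken token)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(bufferBlockSize);
        try
        {
            int read;
            while ((read = await source.ReadAsync(buffer.AsMemory(0, bufferBlockSize), token)) > 0)
            {
                // Only the first `read` bytes are valid data.
                await destination.WriteAsync(buffer.AsMemory(0, read), token);
            }
        }
        finally
        {
            // Rented arrays must always be returned, even when an exception unwinds the loop.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```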

erri120 commented 3 weeks ago

That's fun

halgari commented 3 weeks ago

also...get a load of the last line

buffer.Take(readSize).ToArray()
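
That LINQ call allocates and copies a fresh array for every block read, just to populate the progress event. A small sketch of cheaper alternatives, assuming the event only needs read-only access to the bytes (hypothetical stand-in variables, not a patch against the library):

```csharp
using System;

// Hypothetical illustration; `buffer` and `readSize` stand in for the locals
// in the ChunkDownloader snippet above.
byte[] buffer = new byte[8 * 1024];
int readSize = 4096;

// If the event genuinely needs its own copy, a block copy does it without LINQ:
byte[] received = new byte[readSize];
Buffer.BlockCopy(buffer, 0, received, 0, readSize);

// If the event could accept a read-only view instead, no copy is needed at all:
ReadOnlyMemory<byte> receivedView = buffer.AsMemory(0, readSize);
```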