dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License

Memory growth on Linux with "journal_mode=memory" #34695

Closed IkerPSR closed 2 months ago

IkerPSR commented 2 months ago

Hello, I need help with a memory growth problem in Microsoft.Data.Sqlite 8.0.8 on Linux (reproduced on linux-x64 and linux-arm64). It does not happen on Windows.

Tested in a C# .NET 8 project. The memory growth occurs when "PRAGMA journal_mode=memory" is used. In this mode, when a command (ExecuteNonQuery) affects thousands of rows of a table (for example, a DELETE), the memory of the thread that executed the command grows and is not released when the command finishes. Memory stays reserved for each thread that executes the command, so the total memory used by the process grows with every new thread launched. This only happens on Linux. I have tested the same code on Windows, and there the memory is freed correctly after each command is executed.

I attach a sample project reproducing the problem: Sqlite_journal_mem.zip

On Windows, the memory is released after command execution: (image)

On Linux, the memory grows for each thread: (image)

Am I missing something?

Microsoft.Data.Sqlite version: 8.0.0
Target framework: .NET 8
Operating system: Reproduced on

Thanks in advance

roji commented 2 months ago

@IkerPSR do you ever see an actual OutOfMemoryException? If not, this is probably just normal behavior in a garbage-collected language: the GC only kicks in and reclaims memory when it needs to, or based on various parameters. You may want to look into switching from server GC to workstation GC mode.

IkerPSR commented 2 months ago

@roji On Linux, the memory grows and grows until it takes up all the system memory and the OS kills the process, but no OOM exception occurs, which indicates a serious problem (only on Linux). Currently the application is running in workstation GC mode.

The expected behavior would be for the garbage collector to act at some point, but it doesn't. I don't know whether the problem is in the .NET GC itself on Linux or a leak in the Microsoft.Data.Sqlite library, but the behavior on Linux is not normal. I want to highlight that the example works perfectly on Windows.

Also, in the attached example, I call GC.Collect(); GC.WaitForPendingFinalizers(); but that doesn't help either. I've tried absolutely everything and checked the GC configuration, but nothing works and I'm out of ideas.

Could you try the attached example code on Linux/Windows and give me your impressions?

Thanks in advance

roji commented 2 months ago

@IkerPSR OK, thanks for confirming. It's quite unlikely that Microsoft.Data.Sqlite itself could produce a memory leak that only occurs on Linux. This could be something lower-level...

@cincuranet interested in taking a look?

cincuranet commented 2 months ago

@roji I'll have a 👀.

IkerPSR commented 2 months ago

Hello,

I have been investigating with the dotnet debugging tools and I see something very strange with the memory after executing DELETE several times. The working set displayed by "dotnet counters monitor" is 372MB, but the dump ("dotnet dump collect") only shows 649,517 bytes in "dumpheap -stat". Where is the memory going? I'm not an expert on memory debugging tools, but it seems very strange...

(image)

IkerPSR commented 2 months ago

UPDATE: I noticed that the working set increases when taking the dump, so I ran the test again. First I monitored the counters (before taking the dump): (image)

and then I took the dump: (image)

The numbers changed, but the problem is still the same: the working set is much bigger than the memory reported by the dump...
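
The comparison above (working set versus what the dump reports) can also be cross-checked from the shell, because the Linux kernel exposes each process's resident set size under /proc. A minimal sketch, assuming a Linux system with /proc mounted; the helper name `rss_kb` is made up for illustration, and it is pointed at the current shell only so that the snippet runs as written:

```shell
# Hypothetical helper: print the resident set size (in kB) of a process,
# as reported by the kernel in /proc/<pid>/status.
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Demonstrated on the current shell. For the real test, pass the PID of the
# .NET process instead and compare the value with the totals from the dump.
echo "RSS of shell $$: $(rss_kb $$) kB"
```

If this figure keeps growing while the managed heap size in the dump stays small, the growth is most likely happening outside the managed heap.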

IkerPSR commented 2 months ago

Hello again,

Could the problem be related to glibc malloc and these environment variables?

After setting the following values: MALLOC_ARENA_MAX=2 MALLOC_TRIM_THRESHOLD=40960, the memory remains stable at over 200MB.

I made up the values and I don't know what they really mean. @cincuranet Does this make any sense to you?
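
For anyone repeating this experiment: glibc reads these settings from the environment when the process starts, so they need to be exported before the application is launched. A sketch using the values from the comment above. Two assumptions to flag: the glibc documentation (mallopt) spells the trim variable with a trailing underscore, `MALLOC_TRIM_THRESHOLD_`, which is the spelling used here, and `MyApp.dll` is a placeholder name, not something from this thread.

```shell
# Cap the number of malloc arenas glibc may create. By default it can create
# several per CPU core, and threads can end up on separate arenas.
export MALLOC_ARENA_MAX=2

# Ask glibc to return free memory at the top of the heap to the OS once it
# exceeds this many bytes. Note the trailing underscore in the name.
export MALLOC_TRIM_THRESHOLD_=40960

# Show what a child process started from this shell will inherit.
env | grep '^MALLOC_'

# Then start the application from the same shell, for example:
# dotnet MyApp.dll   (placeholder name)
```

An arena cap would be consistent with the per-thread growth reported earlier, but that is only a plausible reading, not something confirmed in this thread.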

cincuranet commented 2 months ago

This looks more like unmanaged/native memory. Anyway, I would first run eeheap instead of dumpheap. Given that it's not happening on Windows, we can almost rule out the provider itself. Can you try running it on .NET 9 (currently RC1)? Or use a newer/the newest version of Microsoft.Data.Sqlite, because that will bring a newer SQLite (just in case).

IkerPSR commented 2 months ago

@cincuranet Here is the captured eeheap output; the real working set is about 350MB... (image)

I've tested on the latest version of Microsoft.Data.Sqlite, v9.0.0-rc.1.24451.1 (SQLite version 3.45.1), and the problem persists in this version.

cincuranet commented 2 months ago

This gives me confidence that this is not in the managed world, but rather in native memory. You can try, again "just in case", running on the .NET 9 runtime.

Either way, there's not much we can do here in terms of Microsoft.Data.Sqlite. What is left is to figure out whether it is a runtime thing or a SQLite thing (or both). Unfortunately, we don't have the capacity to do the investigation for you. Feel free to tag me, though, wherever you continue this.