mattn / go-sqlite3

sqlite3 driver for go using database/sql
http://mattn.github.io/go-sqlite3
MIT License

memory leak - with 2 in-memory databases in parallel running on linux #1005

Open sthielo opened 2 years ago

sthielo commented 2 years ago

Please see https://github.com/sthielo/go-sqlite-memleak, where we stripped down our code (which eventually terminates with an out-of-memory error when using SQLite in-memory databases) to reproduce what looks like a memory leak.

The README gives some more explanations and observations, but one observation I would like to point out upfront: we only observe the problem on Linux, not on Windows.

We have no clue about the problem's origin. Any help is very much appreciated. Is it our code? go-sqlite3 code? SQLite code?

rittneje commented 2 years ago

While likely not related to any memory leak issue, I will warn you that you should not be using temp files like that. For an in-memory database, the "filename" is not actually backed by the filesystem, so opening a temp file first is pointless. And for a normal database, you actually run the risk of database corruption because of a bug in POSIX advisory locks. In this case, you should instead create the temp file, close it, and then open the database.
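A minimal sketch of that pattern (create the temp file only to reserve a name, close it, then let SQLite open the path); the file-name pattern and DSN here are illustrative, not taken from the linked repo:

```go
package main

import (
	"database/sql"
	"log"
	"os"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// Reserve a unique path, then close the file before SQLite ever opens it,
	// so the advisory-lock problem described above cannot bite.
	f, err := os.CreateTemp("", "snapshot-*.db")
	if err != nil {
		log.Fatal(err)
	}
	path := f.Name()
	f.Close()
	defer os.Remove(path)

	db, err := sql.Open("sqlite3", "file:"+path)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := db.Ping(); err != nil {
		log.Fatal(err)
	}
}
```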

Also, please note that shared cache mode is not recommended because it can cause locking-related issues that are tricky to deal with correctly in Go. If your intent is to be able to open multiple connections to the same in-memory database, use the memdb vfs instead.
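A minimal sketch of the memdb approach, assuming the bundled SQLite is recent enough to ship the memdb VFS; the database name /leaktest is arbitrary, and the name must start with a slash to be shared across connections:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	// Every connection in the pool that opens "/leaktest" via the memdb VFS
	// sees the same in-memory database, without cache=shared.
	db, err := sql.Open("sqlite3", "file:/leaktest?vfs=memdb")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if _, err := db.Exec("CREATE TABLE IF NOT EXISTS t (id INTEGER PRIMARY KEY)"); err != nil {
		log.Fatal(err)
	}
}
```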

For your temporary snapshot database, can you test with just "file::memory:" instead of your current connection string? Also, with your workaround (using an actual file), can you tell if your application is leaking open file descriptors?

sthielo commented 2 years ago

Thanks @rittneje for pointing out those issues - "fixed" them. As you assumed, they had no impact on the memory usage behavior currently being analyzed.

Added file-handle stats => no leaking file handles (15/16 handles used on my machine - opening and closing one file handle per iteration for the dump file).
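For reference, one way to obtain such a file-handle count on Linux; this is an illustration, not the code actually used in the repo:

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// openFDCount counts the entries of /proc/self/fd (Linux only). The count
// includes the descriptor opened to read the directory itself.
func openFDCount() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := openFDCount()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("open file descriptors:", n)
}
```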

Used :memory: in the connection string of the snapshot DB => no change in the increasing memory usage; if anything it seems even worse in terms of total memory (RSS) used after 20 iterations. I only ran it twice with that option. Those runs ended with 5-7 GB of RAM used after 20 iterations vs. the usual 3-5 GB I observed across the many runs with the original connection string. But these numbers vary ... another observation I do not really understand.

sthielo commented 2 years ago

The C++ port (https://github.com/sthielo/sqlite-behavior-test) does NOT show the same memory misbehavior. So EITHER the port is not "good" enough - a good chance given my rusty C++ skills - OR the misbehavior originates in Go code (the project itself, go-sqlite3, or the dump library).

sthielo commented 2 years ago

I conclude - if all my observations hold - that it has something to do with the code in backup.go, because

... but I have not the slightest clue how that could be.
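For context, a sketch of how the online backup API wrapped by backup.go is typically driven from database/sql; the helper name backupTo and the use of Conn.Raw are illustrative, not code from the repro:

```go
package main

import (
	"context"
	"database/sql"
	"log"

	"github.com/mattn/go-sqlite3"
)

// backupTo copies the "main" database of src into dst via the Backup/Step/
// Finish wrappers in backup.go.
func backupTo(ctx context.Context, dst, src *sql.DB) error {
	srcConn, err := src.Conn(ctx)
	if err != nil {
		return err
	}
	defer srcConn.Close()

	dstConn, err := dst.Conn(ctx)
	if err != nil {
		return err
	}
	defer dstConn.Close()

	return dstConn.Raw(func(rawDst interface{}) error {
		return srcConn.Raw(func(rawSrc interface{}) error {
			s := rawSrc.(*sqlite3.SQLiteConn)
			d := rawDst.(*sqlite3.SQLiteConn)

			bk, err := d.Backup("main", s, "main")
			if err != nil {
				return err
			}
			// Step(-1) copies all remaining pages at once; Finish releases
			// the C-side backup object. Forgetting Finish (or keeping raw
			// connections alive) would pin memory on the C heap, invisible
			// to Go's memory stats.
			if _, err := bk.Step(-1); err != nil {
				bk.Finish()
				return err
			}
			return bk.Finish()
		})
	})
}

func main() {
	src, err := sql.Open("sqlite3", "file:src?mode=memory")
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	dst, err := sql.Open("sqlite3", "file:dst?mode=memory")
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	if err := backupTo(context.Background(), dst, src); err != nil {
		log.Fatal(err)
	}
}
```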

rittneje commented 2 years ago

@sthielo Have you tried running pprof? If there is a leak on the Go side (as opposed to C) that may have some clues. https://pkg.go.dev/net/http/pprof
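For anyone following along, the standard way to expose those endpoints in a long-running service looks roughly like this (the listen address is arbitrary):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
)

func main() {
	// The heap profile can then be inspected with:
	//   go tool pprof http://localhost:6060/debug/pprof/heap
	log.Println(http.ListenAndServe("localhost:6060", nil))
}
```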

Also, can you test with just "file::memory:" (no query string)?

sthielo commented 2 years ago

@rittneje Thanks for the hint, but yes, we did. We monitored Go memory stats and also used pprof on a copy of this project very early on when we first encountered this issue (about 5 months ago, when it hit a new service running in our production), but neither shows any memory leak, while measuring the OS process (using ps) does. That led us, already at that early stage, to the conclusion that it is not Go-managed memory that is leaking. That in turn made us suspect a SQLite issue (linked as a native lib with its own memory management), which is the reason for the port to C++ to prove it ... which failed ... honestly leaving me pretty clueless about what to do next to track down this issue.
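A sketch of that kind of comparison, i.e. the Go-managed heap (runtime.MemStats) against what the OS reports as resident set size; the /proc/self/status parsing is Linux-only and illustrative:

```go
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"runtime"
	"strings"
)

// vmRSS returns the VmRSS line from /proc/self/status (Linux only).
func vmRSS() (string, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return "", err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "VmRSS:") {
			return strings.TrimSpace(strings.TrimPrefix(sc.Text(), "VmRSS:")), nil
		}
	}
	return "", sc.Err()
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	rss, err := vmRSS()
	if err != nil {
		log.Fatal(err)
	}
	// If HeapAlloc/Sys stay flat while VmRSS keeps growing, the growth lives
	// outside the Go heap (e.g. C allocations made by SQLite through cgo).
	fmt.Printf("go heap: %d KiB, go sys: %d KiB, os rss: %s\n",
		ms.HeapAlloc/1024, ms.Sys/1024, rss)
}
```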

As a result I revisited the Go code, but I see no clue as to what could be wrong or what more I could test. So the second focus point is to verify the C++ port, or rather to find the difference between the Go project and its C++ port. One thing I'm currently struggling with is reading out the sqlite config and the sqlite db config in order to validate the equivalence of the two projects, since this Go adapter (mattn/go-sqlite3) sets a fair number of default values when initializing the DB and opening connections, which might differ from what happens when accessing the C API directly.

So I'm grateful for any hint about what else to test ... or for help with the current task of reading out the SQLite config, other than setting every option individually to make sure I have the same config in both setups.
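For what it's worth, most of the effective settings can be read back with PRAGMAs, so both the Go side and the C++ side could simply dump and diff them; a sketch with an illustrative (not exhaustive) list of PRAGMAs:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "file::memory:")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Read back a few effective settings; add whichever PRAGMAs matter for
	// the comparison between the Go and the C++ setup.
	for _, p := range []string{
		"journal_mode", "synchronous", "cache_size",
		"temp_store", "locking_mode", "busy_timeout",
	} {
		var v interface{}
		if err := db.QueryRow("PRAGMA " + p).Scan(&v); err != nil {
			log.Printf("%s: %v", p, err)
			continue
		}
		fmt.Printf("%s = %v\n", p, v)
	}

	// Compile-time options can also differ between the two builds.
	rows, err := db.Query("PRAGMA compile_options")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var opt string
		if err := rows.Scan(&opt); err != nil {
			log.Fatal(err)
		}
		fmt.Println(opt)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}
```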

mattn commented 2 years ago

BTW, I'm not sure, but sqlite3dump may have a problem where err can be returned as nil even on failure, since it might be shadowed.

https://github.com/schollz/sqlite3dump/blob/master/dump.go
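The pitfall being described looks roughly like this in isolation (a generic illustration, not the actual sqlite3dump code):

```go
package main

import (
	"errors"
	"fmt"
)

// dump illustrates the shadowing pitfall: the inner ":=" declares a new err
// that shadows the named return value, so the caller always sees nil.
func dump(rows []string) (err error) {
	for _, r := range rows {
		if r == "" {
			err := errors.New("empty row") // BUG: shadows the outer err
			_ = err                        // the error is silently dropped
			continue
		}
	}
	return err // still nil even if an empty row was seen
}

func main() {
	fmt.Println(dump([]string{"a", "", "b"})) // prints <nil>
}
```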

sthielo commented 2 years ago

I removed the external Go dependencies gin and dump by replacing them with primitive implementations of my own (see https://github.com/sthielo/go-sqlite-memleak). I still observe memory leaks when running make test under Linux, as the following output of that test shows: almost every iteration increases the process memory (RSS) observed from the OS perspective.

    0: &{mem:584356 fh:13}
    1: &{mem:866188 fh:13}
    2: &{mem:1148684 fh:13}
    3: &{mem:1149284 fh:13}
    4: &{mem:1431284 fh:13}
    5: &{mem:1627384 fh:13}
    6: &{mem:2152376 fh:13}
    7: &{mem:2199260 fh:13}
    8: &{mem:2526848 fh:13}
    9: &{mem:2526948 fh:13}
    10: &{mem:2809512 fh:13}
    11: &{mem:2809496 fh:13}
    12: &{mem:2809176 fh:13}
    13: &{mem:2809200 fh:13}
    14: &{mem:2810072 fh:13}
    15: &{mem:2809088 fh:13}
    16: &{mem:3334108 fh:13}
    17: &{mem:3616680 fh:13}
    18: &{mem:3615760 fh:13}
    19: &{mem:3615888 fh:13}
    20: &{mem:3615904 fh:13}
clarkmcc commented 5 months ago

FWIW, we're seeing similar behavior, but only in our Linux Docker containers. We cannot reproduce the issue on macOS. We run several in-memory databases at a time. Any single database is only accessed one at a time, but there can be simultaneous accesses across multiple databases. We're using this to craft the DSN:

fmt.Sprintf("file:%s?mode=memory", uuid.New().String())

The leak shows up on our Prometheus metrics, but not through pprof.

Edit: with file::memory: I see the same issue.

[screenshot: memory usage metrics]

clarkmcc commented 5 months ago

@sthielo in your repo you mention

> Lin with WORKAROUND (file based snapshot db) no growth :-)

Can you describe a bit more about how this workaround is implemented? I went through your repo but didn't see multiple types of connections being created.