This may be due to Nim's memory allocator not returning memory to the OS, since the issue seems to go away with `-d:useMalloc`. Is this still an issue?
See the comment below.
Actually, the problem is less severe with `-d:useMalloc`, but if you hammer the server with file uploads over many connections, a huge amount of memory is allocated but never freed. You can test this using the tool bombardier (similar to wrk) with the following command:

```
bombardier -m POST -c 20 -d 10s --body-file '/tmp/100m' http://localhost:8080/
```
This POSTs the file using 20 connections for a duration of 10 seconds.
The peak memory I observed was ~500MiB, and after the test finished, ~300MiB remained and was never freed. I assume this has to do with connections that never finished sending data not having their resources cleaned up. I'm not entirely sure about the implementation details, but that's what it looks like to me.
The issue is much worse under refc, and it looks like very little of it is freed in that case.
The underlying issue seems to be that the leak occurs when clients disconnect unexpectedly: bombardier never managed to complete any of its requests.
Experimenting further, I disabled the request timeout in bombardier and noticed that the leak is no longer present. This confirmed my suspicion that prematurely terminated requests cause the program to leak memory.
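For reference, bombardier's client-side request timeout is controlled with `-t` (it defaults to 2s); raising it far beyond the test duration, as below, is one way to keep requests from being aborted mid-upload. The `10m` value is my choice, not from the original report:

```
bombardier -m POST -c 20 -d 10s -t 10m --body-file '/tmp/100m' http://localhost:8080/
```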
As long as requests aren't terminated during an upload, the program does not leak memory with `getBody` or `post`.
I'm not seeing any reliable way to check whether the underlying client connection is closed, which may be part of the problem. I've checked the request's HttpState, and it's always Alive. I've also checked the read and write stream states, and they're both always Running. As far as I can tell, there's no way to determine whether a request's client connection has closed, which means more opportunities for memory and resource leaks. Is there some other way of determining this?
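For illustration, the checks described above looked roughly like the sketch below. The field paths (`connection`, `reader`, `writer`) are assumptions based on my reading of chronos's httpserver and asyncstream sources and may not be stable public API:

```nim
# Inside a request handler, with `request: HttpRequestRef`.
# NOTE: these field paths are hypothetical, based on the chronos sources,
# and may not be exposed in all versions.
echo request.connection.state          # always printed Alive (HttpState)
echo request.connection.reader.state   # always printed Running (AsyncStreamState)
echo request.connection.writer.state   # always printed Running (AsyncStreamState)
```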
We'll be looking into this, but a point of order for anyone coming to this thread: ORC/ARC is practically unusable in the 1.6 series, so this does not constitute a valid use case.
So, running the latest chronos with nim 1.6.12 and refc, there is no leak here: the memory eventually gets recycled. That's the way the GC works: it leaves a high memory-usage watermark because it generally never returns memory to the OS.
However, there are a few things to point out about this code. In particular, no matter what chronos does, it will unavoidably create at least 2 copies of the body (one in the future and one in the local variable) - this is just the way it works, and one would have to use the streaming interface instead to get lower memory usage: see getBodyReader here - https://github.com/status-im/nim-chronos/blob/94ca0c38478e47ca806dc912b2a725fc41b532c2/chronos/apps/http/httpserver.nim#L376
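To illustrate the streaming approach, a minimal handler along these lines reads the body in fixed-size chunks instead of buffering it whole. This is a sketch, not code from the issue; names like `getBodyReader`, `readOnce`, and `closeWait` follow the chronos sources linked above, but check them against your chronos version:

```nim
import chronos
import chronos/apps/http/httpserver

proc process(r: RequestFence): Future[HttpResponseRef] {.async.} =
  if r.isErr():
    return defaultResponse()  # name of the error-fence response varies across versions
  let request = r.get()
  let readerRes = request.getBodyReader()
  if readerRes.isErr():
    return await request.respond(Http400, "bad request")
  let reader = readerRes.get()
  var buf = newSeq[byte](64 * 1024)  # bounded buffer: memory use stays ~64KiB
  var total = 0
  try:
    while not reader.atEof():
      let n = await reader.readOnce(addr buf[0], buf.len)
      total += n
  finally:
    await reader.closeWait()  # release the body reader's resources
  return await request.respond(Http200, "received " & $total & " bytes")
```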
I'm gonna keep this open for a few days but it looks like there is no fire here (beyond orc bugs in nim which will not be solved generally until 2.0).
@arnetheduck It leaks on ORC, but not refc, as stated in the original issue. That has been my experience with it, but it seems using ORC pretty much destroys any non-leaking guarantees that refc had, so I don't know how much control chronos has over this.
Right, so the thing is that ORC in nim 1.6 is thoroughly broken and all fixes to it go into the devel branch, where new bugs and leaks in it are discovered weekly - there's nothing that chronos can do about this.
We're careful under refc not to hold on to references longer than needed, thus allowing the GC to work efficiently. In ORC, this should translate to generally lower memory usage than refc because that memory is released slightly earlier. However, there's not much we can do in chronos itself, ie beyond recommending the use of a stream and noting that whatever memory refc uses goes into a pool that gets reused - we've verified that this reuse works well in long-running applications.
As an aside, I would recommend raising upstream bugs if you find specific leaks in ORC under the latest devel versions - they're addressing any such issues with urgency right now to whip it into good shape ;)
@arnetheduck I didn't mean to waste the chronos team's time with this bug report. Back when I submitted it, I assumed that ORC worked just as well as refc, so the only possible cause for the leak, as far as I could tell, must have been chronos. Months later, I can say with confidence that I can't trust ORC, and if I had known this when I found the issue, I would have submitted it on the Nim repo instead.
When reading a full request body into memory, ~4x the memory is allocated and never freed, even after running `GC_runOrc`.

Compiler flags: `--threads:on --mm:orc`
Nim version: `1.6.12` (latest stable)
OS: Arch Linux running `linux 6.2.9.arch1-1`
Reproduction instructions: see the bombardier command above.
Reproduction program:
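The original reproduction program isn't preserved in this thread. As a stand-in, here is a minimal sketch of the kind of handler the report describes (reading the full body with `getBody` on port 8080); the structure is assumed, not the reporter's actual code:

```nim
import chronos
import chronos/apps/http/httpserver

proc process(r: RequestFence): Future[HttpResponseRef] {.async.} =
  if r.isErr():
    return defaultResponse()  # name of the error-fence response varies across versions
  let request = r.get()
  # Buffer the whole body in memory - this is the pattern that showed the leak.
  let body = await request.getBody()
  return await request.respond(Http200, "got " & $body.len & " bytes")

let address = initTAddress("127.0.0.1:8080")
let server = HttpServerRef.new(address, process).get()
server.start()
waitFor server.join()
```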
I've also noticed that simple allocations made in the request handler also often don't get freed.
Allocating 100MiB with `let str = newString(1024 * 1024 * 100)` doesn't get freed (under refc it actually takes up more space), but usually doesn't grow past the first request. If you allocate 1GiB with `let str = newString(1024 * 1024 * 1024)`, though, it immediately gets freed (except on refc, where it stays allocated forever).