aikoven opened this issue 3 years ago
Hm - thanks for the report. That's a worry. It probably won't matter, but: what versions of nodejs / fdb are you using, and on what OS?
Ah, right, sorry. I'm using the node:12 docker image, which is based on Debian (I guess).

# uname -a
Linux 48d478a21ee4 5.4.39-linuxkit #1 SMP Fri May 8 23:03:06 UTC 2020 x86_64 GNU/Linux

FDB server and client are version 6.2.28.
Yep, confirmed it leaks. It leaks way faster without the tn.get() call:

const fdb = require('foundationdb')
fdb.setAPIVersion(620)
const db = fdb.open()

;(async () => {
  while (true) {
    await db.doTn(async tn => {})
  }
})()
I've tracked down the issue and seem to have fixed it. I wasn't freeing a temporary heap-allocated structure after transactions were passed to libfdb_c.
📦 foundationdb@1.0.3. See if that fixes things!
With 1.0.3, I still observe growing memory consumption. Without the tn.get('') call, it reaches 1.5GB within 1 minute. With tn.get it grows from 200MB to 450MB in 30 minutes and continues to grow.
Can you try checking out from source to see if you can reproduce the behaviour? That sounds like the behaviour I was seeing in 1.0.2. Maybe something went wrong in the publishing process? (I rewrote the node module publishing scripts yesterday.)
I'm still seeing a (much slower) memory leak in 1.0.3. Without the tn.get() call, in about a minute process memory usage goes from 18MB to 24MB. Valgrind doesn't show the leak, regardless of how many iterations I run the process for. I'm wondering if maybe there's some internal FDB structures we aren't cleaning up properly, and the foundationdb C library is deallocating that memory automatically when its thread is terminated (hence those allocations not showing up in valgrind).
These were the valgrind stats from running 10M iterations (with no tn.get call):
==36832== HEAP SUMMARY:
==36832== in use at exit: 201,613 bytes in 834 blocks
==36832== total heap usage: 14,881,904 allocs, 14,881,070 frees, 836,096,099 bytes allocated
...
==36832== LEAK SUMMARY:
==36832== definitely lost: 64 bytes in 1 blocks
==36832== indirectly lost: 0 bytes in 0 blocks
==36832== possibly lost: 304 bytes in 1 blocks
==36832== still reachable: 201,245 bytes in 832 blocks
==36832== of which reachable via heuristic:
==36832== stdstring : 18,155 bytes in 313 blocks
==36832== multipleinheritance: 48 bytes in 1 blocks
==36832== suppressed: 0 bytes in 0 blocks
==36832== Reachable blocks (those to which a pointer was found) are not shown.
==36832== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==36832==
==36832== For lists of detected and suppressed errors, rerun with: -s
==36832== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Here's what I did:

rm -rf node_modules/foundationdb/prebuilds

Checked that the code would crash, then rebuilt the binaries using:

cd node_modules/foundationdb
yarn node-gyp-build
I didn't see any change after that, though I did some more experimenting:

I ran my test for a longer time and figured out that I was wrong: with 1.0.3 the process memory settles around 3.3GB and doesn't grow indefinitely.

I also noticed that the NodeJS heap grows pretty fast at first, so I tried to collect an allocation timeline profile. For some reason, this resulted in both the NodeJS heap and process memory staying at consistent levels while I collected the profile. Here are some examples:
Starting profiling right away:
0s NodeJS heap: 89.08MB Process memory: 185MB
10s NodeJS heap: 79.06MB Process memory: 215.2MB
20s NodeJS heap: 79.06MB Process memory: 215.2MB
30s NodeJS heap: 79.06MB Process memory: 215.2MB
40s NodeJS heap: 79.06MB Process memory: 215.2MB
50s NodeJS heap: 79.06MB Process memory: 215.2MB
60s NodeJS heap: 79.06MB Process memory: 215.2MB
Starting profiling at 30s, running for 30s:
0s NodeJS heap: 87.97MB Process memory: 175.39MB
10s NodeJS heap: 154.61MB Process memory: 542.76MB
20s NodeJS heap: 143.91MB Process memory: 725.77MB
30s NodeJS heap: 145.48MB Process memory: 750.65MB
# profiling started
40s NodeJS heap: 79.04MB Process memory: 711.21MB
50s NodeJS heap: 79.04MB Process memory: 711.21MB
60s NodeJS heap: 79.04MB Process memory: 711.21MB
# profiling stopped
70s NodeJS heap: 79.04MB Process memory: 788.95MB
80s NodeJS heap: 89.4MB Process memory: 828.09MB
90s NodeJS heap: 92.1MB Process memory: 826.71MB
This looks pretty weird. It probably has something to do with GC, but forcing GC using the debugger doesn't free all the consumed memory. Also, I don't see any leaked objects in the NodeJS heap.
I ran the test against NodeJS 14, and it seems to consume memory much slower:
0s NodeJS heap: 87.6MB Process memory: 183.71MB
10s NodeJS heap: 116.67MB Process memory: 304.7MB
20s NodeJS heap: 82.5MB Process memory: 348.34MB
30s NodeJS heap: 91.73MB Process memory: 366.36MB
40s NodeJS heap: 91.87MB Process memory: 369.73MB
50s NodeJS heap: 90.61MB Process memory: 371.93MB
60s NodeJS heap: 103.81MB Process memory: 379.21MB
My hypothesis is that under heavy load NodeJS triggers GC less often, and retained NodeJS objects lead to attached native objects being retained as well, making some internal FDB client buffer grow.
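One way to probe that hypothesis without FDB at all: allocate in a tight loop and force a collection every N iterations (requires running node with --expose-gc; guarded here since that flag may not be set). If RSS stays bounded with forced GC but grows without it, delayed GC is at least part of the story. This is a generic sketch, not the original test:

```javascript
// Sketch: simulate a hot loop allocating short-lived objects, forcing GC
// every `gcEvery` iterations when node is started with --expose-gc.
// Not the FDB repro itself - just a generic probe of the "GC runs too
// rarely under load" hypothesis.
function hotLoop(iterations, gcEvery) {
  let retained = null
  for (let i = 0; i < iterations; i++) {
    retained = Buffer.alloc(4096) // short-lived, native-backed allocation
    if (global.gc && i % gcEvery === 0) global.gc() // only with --expose-gc
  }
  return process.memoryUsage().rss // final RSS, for comparison between runs
}
```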
Anyway, let me test 1.0.3 with our production services. Will check back once I have data.
Thank you for your time!
Huh, yeah - maybe something like that. I don't think memory usage should grow as it is, though. You're definitely using 1.0.3? (Check node_modules/foundationdb/package.json)
I bet if I rewrote the same test in raw C it would sit at a constant amount of memory. ... I might try tracking the number of allocated FDB Futures and see what happens to that number over time
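The native Future count isn't directly visible from JS, but a FinalizationRegistry can give a rough proxy: register each wrapper object and watch how many have been finalized over time. A sketch (the names here are made up, and finalization timing is entirely up to the GC):

```javascript
// Sketch: rough proxy for "are wrapper objects being collected?" using a
// FinalizationRegistry. Finalization callbacks run at the GC's discretion,
// so treat these counters as a trend indicator, not an exact live count.
let allocated = 0
let finalized = 0
const registry = new FinalizationRegistry(() => { finalized++ })

function track(obj) {
  allocated++
  registry.register(obj)
  return obj
}

// e.g. wrap each result object produced in the hot loop: track({ value })
// and periodically log `allocated - finalized` to see if it trends upward.
```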
You're definitely using 1.0.3? (Check node_modules/foundationdb/package.json)
Yep, did just that.
After running 1.0.3 on one of our production services for a couple of days, here's what I've got:
It can be seen that the major leak went away. The memory still grows, but really slowly.

NodeJS heap:

Another observation: during load bursts, the consumed memory sometimes grows and is not released:
May have something to do with FDB client's Fast Allocator (see https://forums.foundationdb.org/t/memory-leak-in-c-api/554/2), so this may be expected.
I’m glad that helped. The memory leak I fixed for 1.0.3 looked pretty significant, and I was surprised the other day when it sounded like it didn’t make much difference.
And hm, it could be to do with the allocator. But I’d expect those simple loops to sit on a single fdb transaction and reuse it every time. That’s definitely not the full story.
Hi! We noticed a memory leak when using watch.
Repo for debug: https://github.com/Snowman2908/node-foundationdb-watch-memory-leak
Starting profiling:
app-1 | $ /node_modules/.bin/ts-node index.ts
app-1 | NodeJS heap: 94.99MB
app-1 | Process memory: 290.81MB
app-1 | NodeJS heap: 99.58MB
app-1 | Process memory: 305.81MB
app-1 | NodeJS heap: 102.77MB
app-1 | Process memory: 321.31MB
app-1 | NodeJS heap: 105.6MB
End profiling after 10 min:
app-1 | Process memory: 949.81MB
app-1 | NodeJS heap: 95.78MB
app-1 | Process memory: 963.19MB
app-1 | NodeJS heap: 97.89MB
app-1 | Process memory: 976.06MB
app-1 | NodeJS heap: 97.07MB
app-1 | Process memory: 988.94MB
app-1 | NodeJS heap: 96.43MB
app-1 | Process memory: 1002.56MB
app-1 | NodeJS heap: 100.89MB
app-1 | Process memory: 1016.69MB
The situation is similar to what happened with tn: the NodeJS heap size stays at consistent levels, but the total process memory grows.

I use macOS with Rosetta, but we see the same memory leak in our services in Kubernetes.
We noticed a memory leak in our services that use node-foundationdb, which appears to be somewhere in native code, because the NodeJS heap size stays at consistent levels while the total process memory grows. I made a repro, see https://github.com/aikoven/node-foundationdb-memory-leak-repro

This repro makes simple read-only transactions in a loop:

After about an hour, the process consumed about 700MB of memory, while NodeJS heap usage was consistently ~90MB.