peter-mount closed this 5 years ago
It's definitely the fault of LDB, but it could be an issue with memory fragmentation & how Go's GC operates.
I've got both the d3 & ldb services recording memory use into Graphite now & it's clear that it's LDB's fault.
Even after a full resync from Darwin the D3 service only hits 469MB of memory use, but LDB hits 3.7GB!
The problem seems to be that once things settle down LDB is only using just under 25MB of heap, but its RSS is still at 3.7GB. The OS (Linux/Docker in this instance) doesn't claim that memory back, so the server's memory use stays high, as does swap use as pages get pushed out to swap.
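For anyone wanting to see the same heap vs RSS gap from inside a Go process, here is a minimal sketch using `runtime.ReadMemStats` and `debug.FreeOSMemory`. It is not the code LDB uses; `heapReport`, `readHeap` and the 64MB burst are made up for illustration. The idea is that idle heap which has not yet been released back to the OS still counts towards RSS even though no live objects sit in it.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// heapReport is a hypothetical summary of the runtime's view of the heap, in bytes.
type heapReport struct {
	InUse    uint64 // MemStats.HeapInuse: spans that hold at least one object
	Idle     uint64 // MemStats.HeapIdle: spans with no objects in them
	Released uint64 // MemStats.HeapReleased: memory already returned to the OS
}

// Retained estimates idle heap the runtime still holds on to.
// This memory contributes to RSS although nothing is stored in it.
func (r heapReport) Retained() uint64 {
	if r.Released > r.Idle {
		return 0 // guard against unsigned underflow
	}
	return r.Idle - r.Released
}

// readHeap takes a snapshot of the current heap statistics.
func readHeap() heapReport {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return heapReport{InUse: m.HeapInuse, Idle: m.HeapIdle, Released: m.HeapReleased}
}

func main() {
	// Allocate and then drop roughly 64MB to mimic a burst such as a resync.
	burst := make([][]byte, 64)
	for i := range burst {
		burst[i] = make([]byte, 1<<20)
	}
	burst = nil
	runtime.GC()

	before := readHeap()
	// Ask the runtime to run a GC and hand as much memory as possible back to the OS.
	debug.FreeOSMemory()
	after := readHeap()

	fmt.Printf("before: inuse=%dKB retained=%dKB\n", before.InUse>>10, before.Retained()>>10)
	fmt.Printf("after:  inuse=%dKB retained=%dKB\n", after.InUse>>10, after.Retained()>>10)
}
```

The exact numbers depend on the Go version and platform, so treat the output as a rough indicator. Comparing `Retained()` against the RSS reported by the OS should at least show whether the gap is idle heap or something else.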
This issue had a big learning curve. Due to how golang handles its heap, specifically structs with pointers, the heap grew at times when the system was under load, e.g. a full resync or during the morning snapshot updates. I also changed what is stored per service at each station so that it stores only what it needs & fetches the full data at request time, which reduced the memory footprint even more.
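As a rough illustration of the "store only what's needed per station" idea, here is a sketch. It is not the actual LDB code; `Service`, `StationEntry`, `Station` and the `lookup` callback are all hypothetical names. Each station keeps a small value per service (an ID plus the time used for ordering) and resolves the full record only when a request comes in.

```go
package main

import (
	"fmt"
	"sort"
)

// Service is a hypothetical full record, held once in a central store.
type Service struct {
	RID           string
	Destination   string
	CallingPoints []string
}

// StationEntry is the compact per-station record: a key plus the sort field.
type StationEntry struct {
	RID     string
	Departs int64 // departure time in seconds, e.g. Unix time
}

// Station holds entries by value rather than pointers to full services.
type Station struct {
	entries []StationEntry
}

// Add records a service at this station and keeps entries in departure order.
func (s *Station) Add(rid string, departs int64) {
	s.entries = append(s.entries, StationEntry{RID: rid, Departs: departs})
	sort.Slice(s.entries, func(i, j int) bool {
		return s.entries[i].Departs < s.entries[j].Departs
	})
}

// Departures resolves the compact entries to full services at request time.
// Entries whose service can no longer be found are skipped.
func (s *Station) Departures(lookup func(rid string) (Service, bool)) []Service {
	out := make([]Service, 0, len(s.entries))
	for _, e := range s.entries {
		if svc, ok := lookup(e.RID); ok {
			out = append(out, svc)
		}
	}
	return out
}

func main() {
	// A map stands in for whatever store holds the full service data.
	store := map[string]Service{
		"A1": {RID: "A1", Destination: "London"},
		"B2": {RID: "B2", Destination: "Dover"},
	}
	lookup := func(rid string) (Service, bool) {
		svc, ok := store[rid]
		return svc, ok
	}

	var st Station
	st.Add("B2", 200)
	st.Add("A1", 100)

	for _, svc := range st.Departures(lookup) {
		fmt.Println(svc.RID, svc.Destination)
	}
}
```

The trade-off is a lookup per service on each request in exchange for a much smaller structure being held (and scanned by the GC) for every station.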
Whilst the v16 feed was running for the first time tonight I've noticed this:
So although ldb was running, something isn't right.