okdistribute opened this issue 3 years ago
This is great @okdistribute.
I poked at this a bit and wrote a patch! I noticed something big first: my machine can't even allocate 500 multifeeds in memory. Allocating and syncing everything at once also overwhelmed my computer, so I changed `run-parallel` for `async-series`.
I also made the numbers a bit smaller, and used `process.memoryUsage()` to try to see how much heap the syncs were sucking up, instead of relying on the process to get killed. (My computer doesn't kill it cleanly; instead it eats up my swap and freezes everything.)
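For reference, the heap sampling here is just a thin wrapper around `process.memoryUsage()`; a minimal sketch (the helper name is mine, not from the patch):

```javascript
// Hypothetical helper: log current V8 heap usage in megabytes.
// process.memoryUsage().heapUsed reports bytes of heap actually in use.
function logHeap (label) {
  const mb = process.memoryUsage().heapUsed / 1e6
  console.log(label, mb.toFixed(2), 'mb')
  return mb
}

logHeap('before sync')
// ... allocate / sync multifeeds here ...
logHeap('after sync')
```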
From there, I noticed that the memory use goes up non-linearly with the number of syncs. For example, 10 syncs uses 34mb of heap, 20 uses 114mb, and 40 syncs uses 447mb.
I didn't want to push on top of your work, and have been wanting to mess with non-github workflows, so here's a patch you can apply locally to see if you like it:
```shell
curl -sL http://eight45.net/pub/0001-multifeed.patch | git am
```
Thanks @noffle! Feel free to push the patch here, I agree about smaller increments, I picked a big number out of laziness 😅
@okdistribute this is a great example of how easy it is to bloat a `multifeed` with feeds! It's clear that when you add more peers the memory usage balloons.

I did some poking around and think this test might be what you want, as it simulates multiple peers in one program (and hence one heap). So the final `heapUsed` value includes all the simulated peers, aka the entire network. Here's my analysis:
@noffle was right that the memory goes up non-linearly with the number of `multifeed` peers. Looking at the heap snapshot, I found that ~89% of the memory used was retained by `Feed`s, aka `hypercore`s. So I did a series of tests and got the following results:
| Multifeeds (n) | Feeds (f) | Post-sync memory |
|---|---|---|
| 2 | 6 | 9.438872 mb |
| 4 | 17 | 15.238376 mb |
| 10 | 74 | 46.004 mb |
| 20 | 249 | 146.085864 mb |
| 30 | 524 | 290.418824 mb |
| 40 | 899 | 492.932712 mb |
See [1] for how `f` correlates with `n`.

I believe that not all multifeeds should be counted in the memory results; instead, each `multifeed` should be deleted once it has replicated to one persistent `multifeed`, to simulate transient peers connecting. Here is an example patch that deletes all but the fully synced `multifeed`:
```javascript
console.log('pre-sync heap', process.memoryUsage().heapUsed / 1000000, 'mb')
series(replications, (err) => {
  if (err) throw err
  console.log('replicated! everything!')
  console.log('pre-closing heap', process.memoryUsage().heapUsed / 1000000, 'mb')
  // keep one fully synced multifeed; close and drop references to the rest
  let keep = feeds[0]
  let closures = []
  for (let i = feeds.length - 1; i > 0; i--) {
    closures.push((done) => {
      let a = feeds[i]
      delete feeds[i] // drop the reference so GC can reclaim it
      a.close(done)
    })
  }
  series(closures, (err) => {
    if (err) throw err
    global.gc() // requires running node with --expose-gc
    console.log('keep.feeds().length', keep.feeds().length)
    console.log('post-closing heap', process.memoryUsage().heapUsed / 1000000, 'mb')
    debugger
    t.end()
  })
})
```
This requires the `--expose-gc` flag when running node; without it, `heapUsed` includes `multifeed`s that have no remaining references. Even so, garbage collection doesn't catch every floating reference: I still see approximately 24 `Feed`s in the heap at the breakpoint above. After multiple GCs it goes down to the expected 21 `Feed`s. The resulting 'post-closing' `heapUsed` for 20 `multifeed`s total is 20.720344 mb, compared to the 'pre-closing' 147.788704 mb. I get similar results with other `n`s:
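(For reference, `global.gc()` is only defined when node is started with that flag; a quick way to check:)

```shell
# global.gc() only exists when node is launched with --expose-gc
node --expose-gc -e 'console.log(typeof global.gc)'   # prints "function"
node -e 'console.log(typeof global.gc)'               # prints "undefined"
```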
| Multifeeds (n) | Feeds (f) | Post-closing memory |
|---|---|---|
| 10 | 14 | 14.090496 mb |
| 20 | 24 | 20.720344 mb |
| 30 | 34 | 27.0382 mb |
| 40 | 43 | 33.731832 mb |
Sorry for the tome, but I hope this helps! A middleware layer would be great here, since it would let a peer weed out feeds it doesn't want to host. Alternatively, people can save a bit of memory if the storage is file-based (though I believe memory still increases over time with `random-access-file` as the storage).
[1] The number of `feed`s (`f`) goes up much quicker than the number of `multifeed`s (`n`), which makes sense given that they replicate feeds to one another. But I expected `f = n² + n`, since every `multifeed` has one `feed` as the root and then `n` feeds, one for each `multifeed` it gets via replication (including itself). Instead, `f = 2n + (1 + 2 + ... + (n - 1)) + (n - 1)`, with the `2n` standing for the root and local feed of each node, and the `(1 + 2 + ... + (n - 1)) + (n - 1)` standing for the feeds replicated from others. This sum comes from the fact that the replication is done in series, with only the previous and next node if they are available: the first node only gets the second node's feeds, and each subsequent node adds one feed to the running total. The last node is the one exception, as it has no following node to sync with.
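The count above can be checked directly; a quick sketch (plain Node, no dependencies, function name is mine) comparing the formula against the measured values from the first table:

```javascript
// Expected total feed count for n multifeeds replicated in series:
//   f(n) = 2n                      root + local feed per node
//        + (1 + 2 + ... + (n-1))   feeds accumulated down the chain
//        + (n - 1)                 feeds picked up from the following node
function expectedFeeds (n) {
  const chainSum = (n - 1) * n / 2 // closed form of 1 + 2 + ... + (n - 1)
  return 2 * n + chainSum + (n - 1)
}

// Matches the measured values from the table above:
// expectedFeeds(2)  === 6
// expectedFeeds(10) === 74
// expectedFeeds(40) === 899
```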
The `multifeed`s might have fully replicated prior to @noffle's reworking to run them in series, since the replication might have persisted long enough for the feeds to propagate out. One potential fix is to additionally sync the last node with the nodes it wasn't synced with, to ensure they all end up with the same feeds. Though this may not matter if we narrow the test to one `multifeed`.
mafintosh says:
@cblgh you might wanna rerun that leak test tho, we have no known leaks in [hyper-]core since 1-2 months ago
& a comment from @rangermauve
I was trying to find a leak in multifeed actually, but I couldn't find anything when I did heap dumps. Like, I had it running for a while and connecting and disconnecting periodically, but it didn't look like anything new was being kept other than system garbage like compiled code. :person_shrugging:
This is a failing test for the memory leak. The system will print `Killed` after less than 30 seconds; check `/var/log/syslog` for the memory-leak message. Related to https://github.com/cabal-club/commons/issues/15

If the issue is having too many hypercores open at once, an approach like a manifest and request middleware might also fix the memory leak for most production use cases: https://github.com/kappa-db/multifeed/pull/26
Or, some way to replicate in batches, and close the feeds that aren't actively replicating.
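A rough sketch of that batching idea (a hypothetical helper, written with plain callbacks in the style of the test above): run the replication tasks a fixed number at a time, so only `batchSize` feeds need to be open at once and each can be closed before the next batch starts.

```javascript
// Hypothetical sketch: run async tasks in fixed-size batches, serially.
// Each task is a function (done) => { ... done(err) }, e.g. "replicate
// this feed, then close it".
function runInBatches (tasks, batchSize, done) {
  let i = 0
  function nextBatch (err) {
    if (err) return done(err)
    if (i >= tasks.length) return done(null)
    const batch = tasks.slice(i, i + batchSize)
    i += batchSize
    let pending = batch.length
    let failed = null
    batch.forEach((task) => task((err) => {
      if (err) failed = failed || err
      if (--pending === 0) nextBatch(failed) // start next batch when all finish
    }))
  }
  nextBatch(null)
}
```

With `batchSize` small, the number of simultaneously open hypercores stays bounded regardless of how many feeds the network holds.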
To mimic a real-life scenario for cabal, about 70% of the feeds in this test are empty -- only 30% of them have one log entry.