graphops / subgraph-radio

Gossip about Subgraphs with other Graph Protocol Indexers
docs.graphops.xyz/graphcast/radios/subgraph-radio/intro
Apache License 2.0

Get to the bottom of all the memory leaks #150

Open pete-eiger opened 5 months ago

pete-eiger commented 5 months ago

Thanks to https://github.com/graphops/subgraph-radio/issues/147, we discovered that native-tls is causing a lot of memory issues in Subgraph Radio. It seems to be the main culprit so far, but even after removing it, heaptrack still reports the following:

total runtime: 394.24s.
calls to allocation functions: 2470456 (6266/s)
temporary memory allocations: 532012 (1349/s)
peak heap memory consumption: 26.11M
peak RSS (including heaptrack overhead): 136.47M
total memory leaked: 3.57M

This is a huge improvement over the previous report (with native-tls still in the mix), but we should get to the bottom of all the leaks and suspiciously high allocations.
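
To make successive heaptrack reports easier to compare, the summary can be reduced to a few ratios. The following is a minimal sketch, not part of Subgraph Radio or heaptrack: `parse_summary` and `derived_ratios` are hypothetical helpers, and the only input is the summary text quoted above. It assumes every line has the `label: value` shape shown there.

```python
import re

# Summary copied verbatim from the heaptrack report above.
SUMMARY = """\
total runtime: 394.24s.
calls to allocation functions: 2470456 (6266/s)
temporary memory allocations: 532012 (1349/s)
peak heap memory consumption: 26.11M
peak RSS (including heaptrack overhead): 136.47M
total memory leaked: 3.57M
"""


def parse_summary(text):
    """Map each 'label: value' line to the first number found in its value."""
    stats = {}
    for line in text.strip().splitlines():
        label, _, value = line.partition(": ")
        number = re.search(r"\d+(?:\.\d+)?", value)
        if number:
            stats[label] = float(number.group())
    return stats


def derived_ratios(stats):
    """Hypothetical helper: ratios that are easy to compare between reports."""
    runtime = stats["total runtime"]
    calls = stats["calls to allocation functions"]
    temporary = stats["temporary memory allocations"]
    return {
        "allocations_per_s": calls / runtime,
        "temporary_per_s": temporary / runtime,
        "temporary_share": temporary / calls,
        # Both values are reported in the same unit (M), so the unit cancels.
        "leaked_vs_peak_heap": stats["total memory leaked"]
        / stats["peak heap memory consumption"],
    }


if __name__ == "__main__":
    for name, value in derived_ratios(parse_summary(SUMMARY)).items():
        print(f"{name}: {value:.3f}")
```

For this report, roughly 21.5% of allocation calls are temporary, and the leaked total is about 13.7% of peak heap. Tracking these ratios across releases (for example 1.0.5 vs. 1.0.6) is one way to see whether a change actually moved the numbers.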

aasseman commented 5 months ago

On my side, the update to 1.0.5 (which removed native-tls) didn't improve things significantly: image

pete-eiger commented 5 months ago

Thank you @aasseman, we are seeing something similar:

image

On the bright side, memory usage seems to stabilize after some time, and the Radio is actually freeing memory too, so usage no longer climbs endlessly. There are still problems though, as heaptrack indicates. We'll get to the bottom of them soon 🎯

aasseman commented 4 months ago

It's stable until it isn't (still on 1.0.5): image

pete-eiger commented 4 months ago

Darn, I'll take a look at how our instance is doing and report back.

pete-eiger commented 4 months ago

@aasseman you should definitely update to 1.0.6 though, since that release removes the native-tls dependency from the SDK entirely.