anoma / namada

Rust implementation of Namada, a Proof-of-Stake L1 for interchain asset-agnostic privacy
https://namada.net
GNU General Public License v3.0

High resource allocation for namadan #2950

Closed (opsecx closed 1 month ago)

opsecx commented 6 months ago

While getting the aforementioned timeout errors from the client, I noticed that when they happen, namadan often shows up in "top" with 13g allocated in VIRT and 10g in RES. Is this expected behavior? Apologies if this is by design.
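The VIRT and RES columns in "top" correspond to the `VmSize` and `VmRSS` fields that Linux exposes in `/proc/<pid>/status`. VIRT counts all address space the process has reserved, while RES counts only pages currently held in physical RAM, so a large VIRT alone does not necessarily indicate a problem. As a minimal sketch (the function names `parse_vm_fields` and `read_vm_fields` are made up for illustration and are not part of Namada), the two values can be read for a running process like this:

```python
from pathlib import Path


def parse_vm_fields(status_text: str) -> dict:
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        # Each line looks like "VmRSS:\t  10485760 kB".
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])
    return fields


def read_vm_fields(pid: int) -> dict:
    """Read the memory fields of a live process (Linux only)."""
    return parse_vm_fields(Path(f"/proc/{pid}/status").read_text())
```

Calling `read_vm_fields(pid)` with the PID of namadan returns both numbers in kB. Comparing `VmRSS` against the machine's RAM is usually more telling than looking at `VmSize`.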

Fraccaman commented 6 months ago

Our nodes run with 8 GB of RAM, of which 5.5 GB is constantly in use. Do you mind sharing your machine resources?

opsecx commented 6 months ago

> Our nodes run with 8 GB of RAM, of which 5.5 GB is constantly in use. Do you mind sharing your machine resources?

Sure, I have two separate VPSes running (one crew, one pilot). One has 48 GB RAM with, I think, 12 vCPUs; the other has 24 GB with 8(?) vCPUs. The lower-specced one is the one with issues, but both show this amount of virtual memory allocated, which seems odd to me. The lower-specced one often fails requests from the client (I had to upgrade from 16 GB, as that didn't work well). The higher-specced one, not so much.

opsecx commented 6 months ago

> Our nodes run with 8 GB of RAM, of which 5.5 GB is constantly in use. Do you mind sharing your machine resources?

Are these validators or nodes?

Fraccaman commented 6 months ago

I think this could be related to #2955, as we are seeing client requests failing on our infra too, but memory is not being swapped and is not growing.

opsecx commented 6 months ago

I'm starting to wonder if the issue has to do with the different sizing of my network interfaces. That could be possible. But the RAM allocation still bothers me: there's about 13g actually free on average, yet it still sits in virtual memory.

opsecx commented 6 months ago

Again today, namadan on my 24 GB server suddenly halted (namadan, not the server itself) after an attempted shielded sync, and I had to reboot the entire machine because namadan wasn't shutting down gracefully. I really do think there are some hidden performance issues.

opsecx commented 5 months ago

Looking again a bit later, I see that current allocations to namadan are around 20 GB. Resource use seems to increase as the chain moves along. Could there be something akin to a leak somewhere?
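One way to put a number on "increasing resource use" is to record the resident memory of namadan at intervals and fit a straight line through the samples. The sketch below assumes the samples have already been collected as `(unix_seconds, rss_kb)` pairs, for example from the `VmRSS` field of `/proc/<pid>/status`. The function name `rss_growth_per_hour` is hypothetical and not part of Namada.

```python
def rss_growth_per_hour(samples):
    """Least-squares slope of resident memory over time, in kB per hour.

    samples: list of (unix_seconds, rss_kb) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    # cov / var_t is the slope in kB per second; scale to kB per hour.
    return cov / var_t * 3600
```

A slope that stays positive over many hours is consistent with a leak, but it does not prove one: caches or state that legitimately grow with the chain would produce the same trend.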

cwgoes commented 1 month ago

Thanks for the report. This should have since been fixed.