akash-network / support

Akash Support and Issue Tracking

archival akash node consumes high memory on certain queries #195

Open andy108369 opened 3 months ago

andy108369 commented 3 months ago

Archival Akash node 0.30.0 (managed by Shimpa)

The archival akash node typically consumes about 26 GiB of RAM (observed on both Shimpa's archival node and mine).

root@akash-archive:~# free -m
               total        used        free      shared  buff/cache   available
Mem:          128805       27891         798           0      100115       99685
Swap:           4095          10        4085

It appears that certain queries make it consume all of the memory the server has, until the Linux OOM killer terminates it.

Queries:

Multiple getTxsEvent(MsgDelegate, MsgUndelegate, MsgExec) queries were run. This happens when the client falls behind in indexing. The responses to these queries are paginated, so our client was requesting something like page 100 of 105; a rough sketch of such a request is shown below.

If you need more info regarding the queries, please ping cryptoninja1234 on Discord.
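
For reference, a request of roughly this shape can be reproduced against the node's REST (gRPC-gateway) endpoint. The sketch below is an approximation only: the host/port, event filter, and page depth are assumptions on my part, not the client's actual calls (Akash 0.30.x sits on a Cosmos SDK line where GetTxsEvent takes an events filter plus standard pagination parameters).

# Approximate shape of one paginated GetTxsEvent query (REST form); host/port and
# offsets below are assumptions, not taken from the client's real traffic.
curl -sG "http://localhost:1317/cosmos/tx/v1beta1/txs" \
  --data-urlencode "events=message.action='/cosmos.staking.v1beta1.MsgDelegate'" \
  --data-urlencode "pagination.offset=9900" \
  --data-urlencode "pagination.limit=100"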

OOM:

It runs all the way up to 118 GiB, at which point the OOM killer terminates it:

[Wed Mar 13 05:49:40 2024] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-a1e02df52f5f9108e2ade5090ce89ebb25abb8cce561f87239f096b39c9b5160.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-a1e02df52f5f9108e2ade5090ce89ebb25abb8cce561f87239f096b39c9b5160.scope,task=akash,pid=3044672,uid=0
[Wed Mar 13 05:49:40 2024] Out of memory: Killed process 3044672 (akash) total-vm:132212608kB, anon-rss:124655832kB, file-rss:384kB, shmem-rss:0kB, UID:0 pgtables:252652kB oom_score_adj:0
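
For reference, these entries come from the host's kernel log; they can be retrieved with something like the commands below (not part of the original report).

# Pull OOM events from the kernel ring buffer:
dmesg -T | grep -iE 'oom-kill|out of memory'
# or from the journal on systemd hosts:
journalctl -k | grep -iE 'oom-kill|out of memory'
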
andy108369 commented 3 months ago

Shimpa (shimpa on Discord) increased the server's RAM from 128 GiB to 192 GiB, and cryptoninja1234 (on Discord) will retry his queries.

arno01 commented 3 months ago

The archival node was upgraded to v0.32.1, and the client has been running its queries (multiple getTxsEvent(MsgDelegate, MsgUndelegate, MsgExec) requests) since about 08:35 AM.

Update 1: akash is currently consuming 104.1 GiB of RAM.

Update 2: 143.9 GiB now.

Update 3: 157 GiB.

Update 4: dropped back to 115.1 GiB.
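
The figures above were read from a monitoring graph. A rough way to sample the node's resident memory directly on the host (assuming the process is named akash, as in the OOM log above):

# Print the akash process PID, RSS (KiB), and VSZ (KiB) every 60 seconds;
# the process name "akash" is an assumption based on the OOM log above.
watch -n 60 "ps -C akash -o pid,rss,vsz,comm"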

chainzero commented 3 months ago

While the issue was mitigated by the increased RAM allocation, leaving this open for @troian to review.

andy108369 commented 3 months ago

I'll open a discussion for this one and close this issue.