nspcc-dev / neofs-node

NeoFS is a decentralized distributed object storage integrated with the Neo blockchain
https://fs.neo.org
GNU General Public License v3.0

OOM during single node load test with 1K size objects #2686

Closed MaxGelbakhiani closed 7 months ago

MaxGelbakhiani commented 9 months ago


Steps to Reproduce

  1. Create 1 container with default policy "REP 3"
  2. Choose a node that does not hold this container and use it as the single endpoint.
  3. Start a K6 load with 1 KB objects.

As a result, I got an OOM: the neofs-node process was killed and restarted.

Environment

The setup contains 22 nodes with 1TB of RAM.

neofs-node --version
NeoFS Storage node
Version: 0.39.0
GoVersion: go1.21.1

Syslog and pprof profiles are attached. Profiles were gathered at 30-second intervals during the load. Attachments: OOM allocs.zip, OOM_journal.log
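
For readers who want to gather comparable data, here is a minimal Go sketch of taking periodic allocation snapshots from inside a process with the standard `runtime/pprof` package. It is only an illustration and not necessarily how the attached profiles were collected; `snapshotAllocs` is a hypothetical helper name, and the short interval is used only so that the demo finishes quickly.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// snapshotAllocs (hypothetical helper) returns the current "allocs" profile
// in pprof's compressed protobuf format (debug level 0).
func snapshotAllocs() ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("allocs").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// The report used a 30-second interval; a short one keeps this demo quick.
	const interval = 50 * time.Millisecond
	const snapshots = 3

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for i := 0; i < snapshots; i++ {
		<-ticker.C
		data, err := snapshotAllocs()
		if err != nil {
			fmt.Println("snapshot failed:", err)
			return
		}
		// A real collector would write data to a timestamped file instead.
		fmt.Printf("snapshot %d: %d bytes\n", i, len(data))
	}
}
```

Each snapshot can be opened with `go tool pprof`, where the `alloc_space` sample index shows cumulative allocated bytes.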

cthulhu-rider commented 9 months ago

pprof analysis results:

alloc_space

So the weak spot is the object slicer. We should optimize it.

carpawell commented 9 months ago

https://github.com/nspcc-dev/neofs-sdk-go/blob/6f4790451fa1d7c8756addff45ca2e741e79b3bc/object/slicer/slicer.go#L270-L272

This is kind of a scary thing for a 1K object payload. I would say it is a mistake. It is hard to handle such a memory load with any memory capacity if we require 64/128MB for every object put.
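
To illustrate the concern, here is a minimal Go sketch of sizing a buffer by the declared payload size instead of a fixed maximum. It assumes the per-put cost comes from a buffer sized to the maximum object size regardless of the actual payload. It is not the SDK's slicer code and not the fix that was eventually applied; `bufferSize`, `maxObjectSize` and the 64 MiB value are hypothetical names and numbers chosen for illustration.

```go
package main

import "fmt"

// maxObjectSize is a hypothetical per-object limit (64 MiB),
// taken from the lower figure quoted in the comment above.
const maxObjectSize uint64 = 64 << 20

// bufferSize returns how many bytes to allocate for one put.
// payloadSize == 0 means the size is unknown up front, so the
// limit is used; a known size never allocates more than it needs.
func bufferSize(payloadSize, limit uint64) uint64 {
	if payloadSize == 0 || payloadSize > limit {
		return limit
	}
	return payloadSize
}

func main() {
	// A 1K payload with a known size needs a 1K buffer, not 64 MiB.
	fmt.Println(bufferSize(1024, maxObjectSize))
	// Unknown size falls back to the limit.
	fmt.Println(bufferSize(0, maxObjectSize))
}
```

Under this assumption, a 1K put would allocate 1024 bytes instead of 67108864, a 65536-fold reduction per request.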

cthulhu-rider commented 9 months ago

@carpawell thanks for your support. I know about this problem and am currently working on a workaround for it.

carpawell commented 9 months ago

@MaxGelbakhiani, can you, please, share the RPC (objects put per sec i mean) you had in your test run?

MaxGelbakhiani commented 9 months ago

> @MaxGelbakhiani, can you, please, share the RPC (objects put per sec i mean) you had in your test run?

For this exact test run I don't have final results showing the rate and RPC, because the node failed. But they should be similar to the numbers below, which I got from another test run with the same setup.

     data_received............: 0 B    0 B/s
     data_sent................: 571 MB 952 kB/s
     iteration_duration.......: avg=107.56ms min=3.35µs  med=93.16ms max=1.24s p(90)=175.32ms p(95)=211.38ms
     iterations...............: 557700 929.300612/s
     neofs_obj_put_duration...: avg=107.12ms min=25.89ms med=92.71ms max=1.24s p(90)=174.79ms p(95)=210.83ms
     neofs_obj_put_total......: 557700 929.300612/s
     vus......................: 100    min=100      max=100
     vus_max..................: 100    min=100      max=100
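
As a back-of-the-envelope check, these rates can be combined with the 64/128MB per-put figure mentioned earlier in the thread. The Go sketch below assumes that every put really allocates a 64 MiB buffer (the lower figure, read as MiB) and that all 100 VUs have a put in flight at once; both are assumptions for illustration, and `allocRateGiB` and `liveGiB` are hypothetical helper names.

```go
package main

import "fmt"

const (
	mib = 1 << 20
	gib = 1 << 30
)

// allocRateGiB returns the allocation rate in GiB/s when every put
// allocates bufBytes and puts arrive at putsPerSec.
func allocRateGiB(putsPerSec float64, bufBytes uint64) float64 {
	return putsPerSec * float64(bufBytes) / gib
}

// liveGiB returns the memory held at once, in GiB, if each of the
// concurrent writers holds its own buffer of bufBytes.
func liveGiB(writers int, bufBytes uint64) float64 {
	return float64(writers) * float64(bufBytes) / gib
}

func main() {
	const bufBytes = 64 * mib // assumption: 64 MiB allocated per put

	// 929.300612 puts/s and 100 VUs are taken from the k6 summary above.
	fmt.Printf("allocation rate: %.1f GiB/s\n", allocRateGiB(929.300612, bufBytes))
	fmt.Printf("held by 100 writers: %.2f GiB\n", liveGiB(100, bufBytes))
}
```

Under these assumptions the node would have to allocate roughly 58 GiB every second while about 6.25 GiB is in use by in-flight puts, so the pressure would come from the allocation rate rather than from the amount held at any one moment.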

cthulhu-rider commented 9 months ago

@MaxGelbakhiani could you please test the same scenario on:

MaxGelbakhiani commented 9 months ago

With the provided builds and the same test case (1 KB object load), we got the following performance metrics:


     data_received............: 0 B     0 B/s
     data_sent................: 1.1 GB  1.8 MB/s
     iteration_duration.......: avg=57.69ms min=4.62µs  med=48.44ms max=842.1ms  p(90)=91.5ms  p(95)=110ms
     iterations...............: 1039606 1732.480333/s
     neofs_obj_put_duration...: avg=57.23ms min=11.64ms med=47.97ms max=841.65ms p(90)=91.01ms p(95)=109.51ms
     neofs_obj_put_total......: 1039606 1732.480333/s
     vus......................: 100     min=100       max=100
     vus_max..................: 100     min=100       max=100

running (10m00.1s), 000/100 VUs, 1039606 complete and 0 interrupted iterations
write ✓ [ 100% ] 100 VUs  10m0s

No OOMs during 10 min load.

Profiles were captured every 5 seconds: 19_Dec_OOM_issue_2686_GRPC_1Kb_REP-3_Containers=50_Objects=0_Endpoints=1_Readers=0_Writers=100_Duration=10m.zip

Syslog: syslog_oom_fix_issue_2686.log.zip

MaxGelbakhiani commented 9 months ago

If necessary, I can extend the runtime for this test case to have it longer than 10 minutes.

MaxGelbakhiani commented 9 months ago

Ran a 5-hour test yesterday with the neofs-node@2d67e380a3d binary. The run completed successfully with the following results:

     data_received............: 0 B      0 B/s
     data_sent................: 22 GB    1.2 MB/s
     iteration_duration.......: avg=83.94ms min=4.85µs  med=67.41ms max=1m0s   p(90)=125.11ms p(95)=162.09ms
     iterations...............: 21437214 1190.947174/s
     neofs_obj_put_duration...: avg=83.56ms min=11.44ms med=67.03ms max=52.92s p(90)=124.72ms p(95)=161.7ms
     neofs_obj_put_fails......: 1        0.000056/s
     neofs_obj_put_total......: 21437214 1190.947174/s
     vus......................: 100      min=100       max=100
     vus_max..................: 100      min=100       max=100

running (5h00m00.1s), 000/100 VUs, 21437214 complete and 0 interrupted iterations
write ✓ [ 100% ] 100 VUs  5h0m0s

No OOMs during 5 hours.

Profiles were sliced every 2 minutes: 5_hours_profiles_part_1.zip 5_hours_profiles_part_2.zip

Syslog: syslog_oom_fix_issue_2686_5hours.log.zip

cthulhu-rider commented 9 months ago

According to the results (the profiles and the nodes staying alive), we can consider the fix working: it resolves this particular problem.

with fix, we may also observe other weak places:

carpawell commented 8 months ago

> with fix

What fix?

cthulhu-rider commented 8 months ago

> What fix?

with any

carpawell commented 8 months ago

I mean, can you provide a PR link or at least a branch name?

cthulhu-rider commented 8 months ago

> I mean, can you provide a PR link or at least a branch name?

stick to the revisions, they are mentioned everywhere

carpawell commented 8 months ago

One force push and the revision is gone. I am not asking for myself; I have found everything I need. I am asking for the sake of a better issue history and reproducibility.

cthulhu-rider commented 8 months ago

These tests were purely experimental. I don't recommend trying to reproduce them because we haven't recorded the cluster setup.

A PR is coming.

cthulhu-rider commented 7 months ago

@MaxGelbakhiani can we please test the scenario from https://github.com/nspcc-dev/neofs-node/issues/2686#issuecomment-1863225737 with the following revisions:

with 1K objects. Memory profiles are still needed, of course.

roman-khimov commented 7 months ago

Fixed by #2719?

cthulhu-rider commented 7 months ago

AFAIK @MaxGelbakhiani tested #2719 on nodes with much less RAM and no OOM happened, right?

MaxGelbakhiani commented 7 months ago

The test run with #2719 was performed on nodes with 64 GB of RAM. The 20-minute run showed a performance boost and no OOMs.