Heap analysis using `heaptracker`

jm-clius commented 2 years ago

Background

Using @Menduist's branch and instructions, it's possible to use heaptrack on Nim programs.

This is invaluable not only for debugging (e.g. investigating memory leaks), but to understand the general performance and memory inefficiencies in a running node.

Suggested effort

Use heaptrack while running nodes under different scenarios and with different configurations, e.g.

store nodes with many connections and high traffic rates
resource-restricted nodes running only filter and lightpush

Based on the heaptrack output

document memory requirements (as rough guideline) for certain traffic and network conditions
identify areas where we can optimise memory usage and address inefficiencies
identify (and fix) any as-yet-unknown (perhaps slow) memory leaks :)

Ivansete-status commented 1 year ago

As agreed with @jm-clius , we will perform a heaptrack analysis once the next is sorted out: https://github.com/waku-org/nwaku/issues/1662

Also, it is important to write a hackmd documentation guide describing how to properly use the heaptrack tool and how to prepare the nwaku Docker container with heaptrack enabled.

fryorcraken commented 11 months ago

Is that a task or a milestone? Can it be a task part of a larger "profiling" milestone or so?

jm-clius commented 11 months ago

Agreed that this could be part of a larger profiling milestone for nwaku. Some immediate things that I can think about that could form part of such a milestone:

evaluate impact of different ways of running the garbage collector (i.e. can we improve memory usage by just triggering gc manually periodically)
find and remove blocking tasks (e.g. DB tasks)
evaluate CPU and memory usage for different run scenarios (relay, store, filter, etc)
evaluate CPU and memory usage for different transport protocols (tcp, websockets)
etc.

cc @Ivansete-status @NagyZoltanPeter

NagyZoltanPeter commented 11 months ago

I can agree, it is a cool effort for level-up efficiency. Also I can view it is a part of 1M epic. Although output of such a task can only be inputs for tasks that can be bound to milestone.

Most probably this task is a rolling maintenance type of effort or I can think of direct output from it, is a reproducible benchmark suit. Maybe built into ci. @fryorcraken, @jm-clius, @Ivansete-status WDYT?

jm-clius commented 11 months ago

I'd say we want to keep the amount of "always open" maintenance tasks limited to as few as possible. I'd rather define some concrete investigation tasks here and close the milestone once we've done each of these tasks and drawn conclusions from them. Output likely to be other action items (e.g. "parallellise DB cleanups", "improve nim gc use"), but those could go under a different "improvement" milestone, also with concrete, closeable tasks.

One very good output would indeed be regular benchmark checks in CI! Good idea, as this can show us major deviation in e.g. memory/bandwidth use after a specific merge. But I'd simply have that as a subtask to this milestone.

Ivansete-status commented 11 months ago

I can agree, it is a cool effort for level-up efficiency. Also I can view it is a part of 1M epic. Although output of such a task can only be inputs for tasks that can be bound to milestone.

Most probably this task is a rolling maintenance type of effort or I can think of direct output from it, is a reproducible benchmark suit. Maybe built into ci. @fryorcraken, @jm-clius, @Ivansete-status WDYT?

I think we need to have a maintenance milestone per each release. In there, we can add this heaptrack task and all the evaluation tools that you mentioned @jm-clius. Those maintenance issues can gather any important refactoring that we may do (e.g. removing the v1) ( cc @vpavlin )

fryorcraken commented 11 months ago

For heaptracker I can see two types of tasks coming out of it:

improving the status quo. This can be part of this larger "optimization" milestone that could be part of the 1mil epic. The actual work would be part of the measurement
Running heaptracker as part of release process. I wonder if this could be part of the non-regression performance testing track of DST. Let's see we run a fleet of 100 nodes to do performance comparisons between releases. We could ahve a couple of node with heaptracker enabled to double check we did not introduce any memory leak in a release.

fryorcraken commented 11 months ago

I think we need to have a maintenance milestone per each release. In there, we can add this heaptrack task and all the evaluation tools that you mentioned @jm-clius. Those maintenance issues can gather any important refactoring that we may do (e.g. removing the v1) ( cc @vpavlin )

Maybe best to have a template for the "Release" issues and in the template one can have pre-defined check boxes such ash "run heaptracker on releaes candidate".

waku-org / nwaku

Heap analysis using `heaptracker` #986

Background

Suggested effort