waku-org / nwaku

Waku node and protocol.
Other
188 stars 49 forks source link

Heap analysis using `heaptracker` #986

Open jm-clius opened 2 years ago

jm-clius commented 2 years ago

Background

Using @Menduist's branch and instructions, it's possible to use heaptrack on Nim programs.

This is invaluable not only for debugging (e.g. investigating memory leaks), but to understand the general performance and memory inefficiencies in a running node.

Suggested effort

Use heaptrack while running nodes under different scenarios and with different configurations, e.g.

Based on the heaptrack output

Ivansete-status commented 1 year ago

As agreed with @jm-clius , we will perform a heaptrack analysis once the next is sorted out: https://github.com/waku-org/nwaku/issues/1662

Also, it is important to write a hackmd documentation guide describing how to properly use the heaptrack tool and how to prepare the nwaku Docker container with heaptrack enabled.

fryorcraken commented 11 months ago

Is that a task or a milestone? Can it be a task part of a larger "profiling" milestone or so?

jm-clius commented 11 months ago

Agreed that this could be part of a larger profiling milestone for nwaku. Some immediate things that I can think about that could form part of such a milestone:

cc @Ivansete-status @NagyZoltanPeter

NagyZoltanPeter commented 11 months ago

I can agree, it is a cool effort for level-up efficiency. Also I can view it is a part of 1M epic. Although output of such a task can only be inputs for tasks that can be bound to milestone.

Most probably this task is a rolling maintenance type of effort or I can think of direct output from it, is a reproducible benchmark suit. Maybe built into ci. @fryorcraken, @jm-clius, @Ivansete-status WDYT?

jm-clius commented 11 months ago

I'd say we want to keep the amount of "always open" maintenance tasks limited to as few as possible. I'd rather define some concrete investigation tasks here and close the milestone once we've done each of these tasks and drawn conclusions from them. Output likely to be other action items (e.g. "parallellise DB cleanups", "improve nim gc use"), but those could go under a different "improvement" milestone, also with concrete, closeable tasks.

One very good output would indeed be regular benchmark checks in CI! Good idea, as this can show us major deviation in e.g. memory/bandwidth use after a specific merge. But I'd simply have that as a subtask to this milestone.

Ivansete-status commented 11 months ago

I can agree, it is a cool effort for level-up efficiency. Also I can view it is a part of 1M epic. Although output of such a task can only be inputs for tasks that can be bound to milestone.

Most probably this task is a rolling maintenance type of effort or I can think of direct output from it, is a reproducible benchmark suit. Maybe built into ci. @fryorcraken, @jm-clius, @Ivansete-status WDYT?

I think we need to have a maintenance milestone per each release. In there, we can add this heaptrack task and all the evaluation tools that you mentioned @jm-clius. Those maintenance issues can gather any important refactoring that we may do (e.g. removing the v1) ( cc @vpavlin )

fryorcraken commented 11 months ago

For heaptracker I can see two types of tasks coming out of it:

  1. improving the status quo. This can be part of this larger "optimization" milestone that could be part of the 1mil epic. The actual work would be part of the measurement
  2. Running heaptracker as part of release process. I wonder if this could be part of the non-regression performance testing track of DST. Let's see we run a fleet of 100 nodes to do performance comparisons between releases. We could ahve a couple of node with heaptracker enabled to double check we did not introduce any memory leak in a release.
fryorcraken commented 11 months ago

I think we need to have a maintenance milestone per each release. In there, we can add this heaptrack task and all the evaluation tools that you mentioned @jm-clius. Those maintenance issues can gather any important refactoring that we may do (e.g. removing the v1) ( cc @vpavlin )

Maybe best to have a template for the "Release" issues and in the template one can have pre-defined check boxes such ash "run heaptracker on releaes candidate".