scylladb / scylla-cluster-tests

Tests for Scylla Clusters
GNU Affero General Public License v3.0

Collect perf record on specific events #6309

Open soyacz opened 1 year ago

soyacz commented 1 year ago

In some cases, the details we provide in issues are not sufficient for investigation. Developers need, e.g., a perf record for a specific CPU (shard) to be able to find the root cause. Because we have the means to trigger additional checks on events (EventsHandler, which monitors the event stream and performs an action when a given event appears), we could collect perf records on certain events and attach them to the issues. There are open questions, though:

  1. What is the trigger?
  2. What is the algorithm for finding the culprit node/shard?
  3. What is the specific command to use for perf record collection? (https://opensource.docs.scylladb.com/stable/kb/use-perf.html does not explain it in enough detail)

An example request for this is here: https://github.com/scylladb/scylladb/issues/13759#issuecomment-1578656454. In that case the record was taken manually (sudo perf record -C 7 --call-graph=dwarf -F 99 -p $(pidof scylla)).
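
If this were automated, the manual command could be parameterized roughly as below. This is only a sketch: the helper name `build_perf_record_command`, its parameters, and the optional `-o <file>` and `-- sleep <seconds>` additions are assumptions for illustration, not existing SCT code. How the CPU of the culprit shard would be chosen is still the open question from the list above.

```python
import shlex
from typing import Optional


def build_perf_record_command(
    cpu: int,
    frequency_hz: int = 99,
    duration_s: Optional[int] = None,
    output_path: Optional[str] = None,
) -> str:
    """Build a shell command line for perf record on one CPU (shard).

    Hypothetical helper: with only `cpu` given, it reproduces the manual
    command quoted in this issue. The other arguments are assumptions.
    """
    if cpu < 0:
        raise ValueError("cpu must be a non-negative integer")

    parts = [
        "sudo", "perf", "record",
        "-C", str(cpu),              # restrict sampling to this CPU
        "--call-graph=dwarf",        # unwind stacks using DWARF debug info
        "-F", str(frequency_hz),     # sampling frequency in Hz
        "-p", "$(pidof scylla)",     # expanded by the remote shell
    ]
    if output_path is not None:
        # Assumption: write to an explicit file so it can be collected later.
        parts += ["-o", shlex.quote(output_path)]
    if duration_s is not None:
        # Assumption: bound the recording time with a dummy sleep workload.
        parts += ["--", "sleep", str(duration_s)]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_perf_record_command(7))
    print(build_perf_record_command(7, duration_s=30, output_path="/tmp/perf.data"))
```

The result is a string rather than an argument list because `$(pidof scylla)` relies on shell expansion on the node where the command runs.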

fruch commented 1 year ago

We would like to know how often this might be needed, how hard it is to perform this sequence for an issue that doesn't happen consistently, and whether it would be worth our while to automate it.

So feedback from the core team would help a lot here.