scylladb / scylla-monitoring

Simple monitoring of Scylla with Grafana
https://scylladb.github.io/scylla-monitoring/
Apache License 2.0

Alternator OPs are not representative of real ops - in case of BatchGetItem and similar batch ops #2374

Open mykaul opened 1 month ago

mykaul commented 1 month ago

A customer is running BatchGetItem with 100 items in each batch. The dashboard shows fewer than 1K batch OPs, but the effective load is up to 100K item OPs. The visualization is correct per operation, but it is somewhat misleading about how many OPs are actually being run against the cluster. I did not look at the code where we count such OPs, but I'd argue we should count the overall OPs, not 1 per batch. @tzach, @nyh, @nuivall - thoughts?
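
To make the gap concrete, here is a minimal Python sketch of the arithmetic. The function name `effective_item_ops` is hypothetical and is not part of Scylla or scylla-monitoring; the figures are the rounded ones from the report above.

```python
def effective_item_ops(batch_ops_per_sec: float, items_per_batch: int) -> float:
    """Item-level operations per second implied by a batch request rate."""
    return batch_ops_per_sec * items_per_batch


# Rounded figures from the report: about 1K BatchGetItem calls/s, 100 items each.
batch_rate = 1_000
items_per_batch = 100

print(effective_item_ops(batch_rate, items_per_batch))  # 100000
```

A dashboard that plots only `batch_rate` understates the item-level load by a factor of `items_per_batch`.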

mykaul commented 1 month ago

Perhaps instead of counting it here: https://github.com/scylladb/scylladb/blob/849856b96407f3a79811c6aa4be98a38b6b2df8f/alternator/executor.cc#L3332 we should look at requests.size around https://github.com/scylladb/scylladb/blob/849856b96407f3a79811c6aa4be98a38b6b2df8f/alternator/executor.cc#L3386
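
As an illustration only, the idea of counting by the size of the request list rather than once per call could look like the Python sketch below. This is not the code in `executor.cc`; the class `BatchStats` and both counter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BatchStats:
    batch_get_item_calls: int = 0  # hypothetical counter: +1 per BatchGetItem request
    batch_get_item_items: int = 0  # hypothetical counter: +1 per item in the request

    def record(self, requests: list) -> None:
        """Count the call once and also count every item it asked for."""
        self.batch_get_item_calls += 1
        self.batch_get_item_items += len(requests)


stats = BatchStats()
for _ in range(3):
    # Three batches of 100 keys each.
    stats.record([{"key": i} for i in range(100)])

print(stats.batch_get_item_calls, stats.batch_get_item_items)  # 3 300
```

Keeping both counters lets a dashboard show the request rate and the item rate side by side, and derive the average batch size from their ratio.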

nuivall commented 2 weeks ago

New metrics were added in https://github.com/scylladb/scylladb/commit/390e01673be1f365e42f826490ed95aeede1a259

amnonh commented 2 days ago

I'm going to add the batch size alongside the batch count.