scylladb / scylla-monitoring

Simple monitoring of Scylla with Grafana
https://scylladb.github.io/scylla-monitoring/
Apache License 2.0

Read amplification indicators #2014

Open avikivity opened 1 year ago

avikivity commented 1 year ago

Let's add two indicators to the overview page:

Read amplification: rate of the sum of query-class IOPS, divided by the rate of replica-side reads.
Post-cache read amplification: the same, but only considering queries that missed the cache.

tzach commented 1 year ago

Is it a better fit for the advanced dashboard? Amplification is a challenging concept to understand. What are valid, good, or bad values for this metric?

amnonh commented 7 months ago

@michoecho can you take a look at this issue and suggest the relevant metrics?

michoecho commented 7 months ago

It depends on what exactly we want.

@avikivity suggested

sum by (instance, shard) (rate(scylla_io_queue_total_read_ops{class=~"sl:.*|query"}[1m])) / sum by (instance, shard) (rate(scylla_database_total_reads{class="user"}[1m]))
or, depending on what you want
sum by (instance, shard) (rate(scylla_io_queue_total_read_ops{class=~"sl:.*|query"}[1m])) / sum by (instance, shard) (rate(scylla_cache_reads[1m]))

and

sum by (instance, shard) (rate(scylla_io_queue_total_read_ops{class=~"sl:.*|query"}[1m])) / sum by (instance, shard) (rate(scylla_cache_reads_with_misses[1m]))

But note that the presence of any range scans, or any BYPASS CACHE queries, and probably any of 10 other unknown things, will make the results ambiguous. If we decide on what we want the "read amplification" metrics to precisely mean, maybe we can export enough metrics from the server side to make it robust, but with what we have right now I think it would be rather fragile.

amnonh commented 7 months ago

@avikivity can you take a look? I want to see what can be included in the next monitoring release.

amnonh commented 5 months ago

@avikivity ping

amnonh commented 4 months ago

@avikivity ping

amnonh commented 4 months ago

@avikivity, continuing with what @michoecho wrote: the core question is, what are you looking to see? If it's the compaction strategy overhead, it would be at the sstable reader level, and would best be modeled with a histogram of how many sstable reads are performed per item retrieved. In that case, we should be agnostic to cache use and range scans.

On the other side of that scale, we have how many sstable reads occur in the system compared to how many requests (retrieved items?). This will be impacted by cache use, range scans, and ALLOW FILTERING (and maybe other things we haven't thought of).
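
For illustration only, if such a reader-level histogram existed, charting it in PromQL could look roughly like the line below (scylla_sstable_reads_per_item_bucket is a hypothetical metric name; nothing like it is exported today):

# p99 number of sstable reads needed per retrieved item, per instance/shard (hypothetical histogram)
histogram_quantile(0.99, sum by (le, instance, shard) (rate(scylla_sstable_reads_per_item_bucket[1m])))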

avikivity commented 4 months ago

I think we need both. One gauge that divides sstable reads by replica reads that missed the cache, to obtain log-structured merge tree amplification, and another that divides query I/O by sstable reads, to obtain the average I/O per sstable read. If it can be made per service level, so much the better, but I don't think we have enough selectivity.
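
For illustration, a rough PromQL shape for these two gauges, assuming a hypothetical sstable-read counter scylla_sstable_read_ops (no such metric is confirmed in this thread) alongside the metrics already quoted above:

# LSM amplification: sstable reads per replica read that missed the cache (numerator is hypothetical)
sum by (instance, shard) (rate(scylla_sstable_read_ops[1m])) / sum by (instance, shard) (rate(scylla_cache_reads_with_misses[1m]))

# Average I/O per sstable read: query-class disk ops divided by sstable reads (denominator is hypothetical)
sum by (instance, shard) (rate(scylla_io_queue_total_read_ops{class=~"sl:.*|query"}[1m])) / sum by (instance, shard) (rate(scylla_sstable_read_ops[1m]))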

avikivity commented 4 months ago

(could be exposed as a histogram but I'm not thrilled with loading more metrics)

amnonh commented 4 months ago

@avikivity, how do you address range queries in these cases?

amnonh commented 4 months ago

@avikivity, the issue with range queries remains. It sounds like what we really want is a metric at the sstable level that explicitly counts the number of sstables per read, instead of trying to estimate it.

@michoecho do we have anything like that? Can we add it?

michoecho commented 4 months ago

@michoecho do we have anything like that?

No.

Can we add it?

We could. It would require propagating a stats& struct through the reader creation APIs, which I think is a good idea in general; per-query stats should be a given. But it would take some work.