What this PR does
We have many metrics covering concurrent ingestion and fetching. I added panels for high-level metrics such as throughput and latency, and for places in the code where we rely heavily on assumptions (i.e. where we estimate values).
Added panels
latency (time to process a batch or time to receive a fetched batch)
decompressed throughput from Kafka
actual and estimated ingested samples
actual and estimated bytes per record
Modified code
I figured it'd be useful to see exemplars for cases where processing time and batch wait time are long, so I made small code changes too.
Before
After [^1]
[^1]: The estimated bytes per record panel shows no data because its metric was added in r318, which hasn't been released yet in the environment I tested against.
Checklist
[x] Tests updated.
[n/a] Documentation added.
[ ] CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.