MaterializeInc / materialize

The Cloud Operational Data Store: use SQL to transform, deliver, and act on fast-changing data.
https://materialize.com

UX: Show latest available kafka offset in mz_source_info to allow calculating the lag with sql #4985

Open krishmanoh2 opened 3 years ago

krishmanoh2 commented 3 years ago

It is hard to calculate source lag with SQL in mz today.

For Kafka sources, showing the latest available Kafka offset (committed/uncommitted) would help customers and prospects calculate lag. Ideally, this view would also show throughput and latency, so that a customer can determine how long it will take to catch up if there is lag.
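To make the arithmetic concrete, here is a minimal sketch of the calculation a customer would want to do once the latest offsets are exposed. The function names, the per-partition dictionaries, and the rate figures are all hypothetical and only illustrate the idea; nothing here reflects an existing Materialize view or column.

```python
from typing import Dict, Optional


def partition_lag(latest: Dict[int, int], consumed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: latest available offset minus the offset ingested so far.

    Both inputs map partition id -> offset (hypothetical shape).
    A partition with no consumed offset yet is treated as starting at 0.
    """
    return {p: max(latest[p] - consumed.get(p, 0), 0) for p in latest}


def catch_up_seconds(total_lag: int, ingest_rate: float, produce_rate: float) -> Optional[float]:
    """Estimated seconds to drain the lag, given rates in messages per second.

    Returns None when ingestion is not outpacing production, because the
    source would then never catch up.
    """
    net_rate = ingest_rate - produce_rate
    if net_rate <= 0:
        return None
    return total_lag / net_rate


# Made-up example values for a two-partition topic.
latest = {0: 1_000, 1: 2_500}
consumed = {0: 900, 1: 2_000}

lag = partition_lag(latest, consumed)
total = sum(lag.values())
eta = catch_up_seconds(total, ingest_rate=300.0, produce_rate=100.0)
print(lag, total, eta)
```

The catch-up estimate uses the net rate (ingest minus produce), which is why throughput needs to be visible alongside the offsets.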

elindsey commented 3 years ago

This will likely be done opportunistically within the next couple of months, after the exactly-once and idempotent ingest tasks wrap up. We currently view it as low priority, but will revisit if we get a fuller understanding of customer need (specifically, how customers are currently monitoring consumer lag and why that, paired with the existing dataflow lag metrics, is not sufficient).

krishmanoh2 commented 3 years ago

Customers currently rely on visual cues to estimate lag: they view Materialize's Prometheus metrics in Grafana.

elindsey commented 3 years ago

A few notes from recent discussions: mz has a statistics_interval_ms option that can be set at Kafka source creation to dump detailed statistics into materialized.log. That has been broken for a while, but is now working on the latest main branch. The likely path forward is to see whether those metrics can be recorded in a system table instead of only being written to the log file. The potential complication is the overhead of that stats collection, since it incurs some synchronization, a number of hdrhistograms, and JSON serialization/deserialization.
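As a rough illustration of the "system table" idea, the sketch below flattens one statistics payload into per-partition rows. It assumes the logged statistics follow the librdkafka statistics JSON layout (a `topics` object containing `partitions`, each with `hi_offset` and `consumer_lag`); that is an assumption, not something confirmed in this thread. The function name and the row shape are hypothetical.

```python
import json
from typing import Iterator, Tuple

# Row shape (hypothetical): (topic, partition, hi_offset, consumer_lag)
Row = Tuple[str, int, int, int]


def partition_rows(stats_json: str) -> Iterator[Row]:
    """Yield one row per real partition from a librdkafka-style stats payload."""
    stats = json.loads(stats_json)
    for topic, topic_info in stats.get("topics", {}).items():
        for pid, pinfo in topic_info.get("partitions", {}).items():
            # librdkafka reports an internal partition with a negative id; skip it.
            if int(pid) < 0:
                continue
            yield (
                topic,
                int(pid),
                pinfo.get("hi_offset", -1),
                pinfo.get("consumer_lag", -1),
            )


# Made-up sample payload containing only the fields used above.
sample = json.dumps({
    "topics": {
        "events": {
            "partitions": {
                "0": {"hi_offset": 1000, "consumer_lag": 100},
                "1": {"hi_offset": 2500, "consumer_lag": 500},
                "-1": {"hi_offset": -1001, "consumer_lag": -1},
            }
        }
    }
})

rows = sorted(partition_rows(sample))
print(rows)
```

Extracting only a few fields per partition like this would still pay the JSON parsing cost for the whole payload, which is the overhead concern noted above.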