hansetag / iceberg-catalog

A Rust implementation of the Iceberg REST Catalog specification.
Apache License 2.0

Export Prometheus metrics #87

Open twuebi opened 3 weeks ago

twuebi commented 3 weeks ago
corleyma commented 3 weeks ago

In general, it would be great to have a more comprehensive plan for instrumentation. Exposing Prometheus metrics with RED (request rate, error rate, duration) for each endpoint is a good start, but it would be great to (optionally) also expose per-warehouse, per-namespace, and per-table metrics (for example, a table count with warehouse and namespace labels, or a data file count with table and namespace labels), or at least to have an easy path for extending instrumentation without forking.
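As a rough illustration of what per-endpoint RED metrics could look like, here is a minimal sketch using only the Rust standard library. It is not how iceberg-catalog does or will implement this: the types `EndpointStats` and `RedMetrics`, the metric names, and the rule that a status of 500 or above counts as an error are all assumptions made for the example. A real service would more likely use an existing metrics crate and a histogram for durations.

```rust
use std::collections::BTreeMap;
use std::fmt::Write as _;
use std::time::Duration;

/// RED counters for one endpoint. Hypothetical type, not taken from iceberg-catalog.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct EndpointStats {
    pub requests: u64,
    pub errors: u64,
    pub duration_seconds: f64,
}

/// In-memory registry keyed by (HTTP method, route template).
/// Keying on the route template rather than the concrete path keeps
/// the number of label values bounded.
#[derive(Debug, Default)]
pub struct RedMetrics {
    by_endpoint: BTreeMap<(String, String), EndpointStats>,
}

impl RedMetrics {
    /// Record one finished request.
    /// Assumption: only 5xx responses are counted as errors.
    pub fn observe(&mut self, method: &str, route: &str, status: u16, elapsed: Duration) {
        let stats = self
            .by_endpoint
            .entry((method.to_owned(), route.to_owned()))
            .or_default();
        stats.requests += 1;
        if status >= 500 {
            stats.errors += 1;
        }
        stats.duration_seconds += elapsed.as_secs_f64();
    }

    /// Render all counters in the Prometheus text exposition format.
    /// The metric names are illustrative.
    pub fn render(&self) -> String {
        let mut out = String::new();
        self.write_family(&mut out, "catalog_http_requests_total", |s| {
            s.requests.to_string()
        });
        self.write_family(&mut out, "catalog_http_request_errors_total", |s| {
            s.errors.to_string()
        });
        self.write_family(&mut out, "catalog_http_request_duration_seconds_total", |s| {
            s.duration_seconds.to_string()
        });
        out
    }

    /// Write one counter family: a TYPE line, then one sample per endpoint.
    fn write_family(
        &self,
        out: &mut String,
        name: &str,
        value: impl Fn(&EndpointStats) -> String,
    ) {
        // Writing to a String cannot fail, so the results are ignored.
        let _ = writeln!(out, "# TYPE {name} counter");
        for ((method, route), stats) in &self.by_endpoint {
            let _ = writeln!(
                out,
                "{name}{{method=\"{method}\",route=\"{route}\"}} {}",
                value(stats)
            );
        }
    }
}
```

A request handler or middleware would call `observe` once per completed request, and a `/metrics` handler would return `render()`. Request rate, error rate, and mean duration can then be derived on the Prometheus side from the three counters.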

c-thiel commented 3 weeks ago

@corleyma thanks for your feedback. To be honest, I am a bit hesitant to expose business metrics down to the table or namespace level via a /metrics endpoint, because of the high cardinality this would create. Instead, we plan to expose those via an information_schema in the system namespace. We have already reserved the system namespace so that users don't create it before we manage to populate it. Our biggest problem here is that we need to include a query engine with Iceberg support, such as DataFusion, in order to update those statistics in Iceberg tables in the information schema.

Any thoughts on that approach?
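To put rough numbers on the cardinality concern, the sketch below compares how many time series per-endpoint metrics and per-table metrics would produce. The `CatalogShape` type, both functions, and every figure in the usage note are hypothetical and chosen only to illustrate the arithmetic. They are not measurements from any deployment.

```rust
/// Hypothetical shape of a catalog deployment, assuming a uniform layout.
#[derive(Debug, Clone, Copy)]
pub struct CatalogShape {
    pub warehouses: u64,
    pub namespaces_per_warehouse: u64,
    pub tables_per_namespace: u64,
}

impl CatalogShape {
    /// Total number of tables across all warehouses and namespaces.
    pub fn table_count(&self) -> u64 {
        self.warehouses
            .saturating_mul(self.namespaces_per_warehouse)
            .saturating_mul(self.tables_per_namespace)
    }
}

/// Series created when every table gets its own label set:
/// one series per table per metric.
pub fn per_table_series(shape: CatalogShape, metrics_per_table: u64) -> u64 {
    shape.table_count().saturating_mul(metrics_per_table)
}

/// Series created by endpoint-level metrics:
/// one series per endpoint per metric, independent of catalog size.
pub fn per_endpoint_series(endpoints: u64, metrics_per_endpoint: u64) -> u64 {
    endpoints.saturating_mul(metrics_per_endpoint)
}
```

For example, with 5 warehouses, 200 namespaces per warehouse, and 500 tables per namespace, three per-table metrics would yield 1,500,000 series, while three metrics over 30 endpoints would yield 90. The first number grows with the catalog, the second does not.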

corleyma commented 2 weeks ago

Exposing those via information_schema is mostly OK and I see the appeal, but it does mean that folks will need to put in more work if they actually want to ingest any of those metrics into their monitoring system. For normal databases that is mostly trivial (there are lots of DB query Prometheus exporters), but there is probably nothing off the shelf that can read Iceberg (yet).