confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

Differentiate between client/server errors #2167

Open bdbene opened 2 years ago

bdbene commented 2 years ago

The JMX metric that currently tracks errors, jmx.kafka.schema.registry.request_error_rate, doesn't differentiate between client errors (4xx) and server-side errors (5xx).

It would be very helpful from an operational point of view to be able to distinguish between those types of errors. For example, if there were two metrics along the lines of:

Then anyone maintaining a Schema Registry instance would be able to set up monitoring and alerting on schema-registry errors, and not have to worry about false alarms every time a client uses the service incorrectly.

OneCricketeer commented 2 years ago

If the status code were a tag, you could use PromQL regex matchers on ^4.. and ^5..
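
As a rough illustration of that regex split, here is a minimal Python sketch. The classify helper is hypothetical and not part of Schema Registry; it only mirrors what the ^4.. and ^5.. patterns would select if the status code were exposed as a tag value. It uses full-string matching because PromQL regex matchers are anchored at both ends.

```python
import re

# Patterns from the comment above. PromQL anchors regex matchers on both
# sides, so re.fullmatch is used here to get the same behaviour.
CLIENT_ERROR = re.compile(r"4..")
SERVER_ERROR = re.compile(r"5..")


def classify(status_code: str) -> str:
    """Hypothetical helper: bucket an HTTP status code string by error class."""
    if CLIENT_ERROR.fullmatch(status_code):
        return "client"
    if SERVER_ERROR.fullmatch(status_code):
        return "server"
    return "other"


if __name__ == "__main__":
    for code in ("200", "404", "503"):
        print(code, classify(code))
```

With this split, an alert could be attached to the "server" bucket only, so client misuse would not page anyone.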

pkuhlu commented 1 month ago

+1. @rayokota wonder if your team could help with this request?

pkuhlu commented 1 month ago

@Claimundefine wonder if your team could help with this request?

pkuhlu commented 1 month ago

To conclude: after looking into the code, the SR JMX metrics actually expose the http_status_code needed for this purpose, which comes from the underlying Jersey server. @bdbene @OneCricketeer
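
For anyone who wants the two numbers from the original request, a sketch of the aggregation in Python follows. It assumes the per-status rates have already been scraped into a dict keyed by the http_status_code value; the function name, the dict shape, and the sample values are made up for illustration and are not a Schema Registry API.

```python
def split_error_rates(rates_by_status):
    """Sum per-status request rates into client (4xx) and server (5xx) totals.

    rates_by_status: hypothetical mapping of http_status_code (as a string)
    to a request rate, e.g. collected from JMX by a monitoring agent.
    """
    totals = {"client": 0.0, "server": 0.0}
    for status, rate in rates_by_status.items():
        code = int(status)
        if 400 <= code <= 499:
            totals["client"] += rate
        elif 500 <= code <= 599:
            totals["server"] += rate
        # Non-error codes (1xx, 2xx, 3xx) are ignored on purpose.
    return totals


if __name__ == "__main__":
    # Made-up sample values, only to show the shape of the input.
    sample = {"200": 10.0, "404": 2.0, "409": 1.0, "500": 0.5}
    print(split_error_rates(sample))
```

Alerting on the server total alone would avoid the false alarms described at the top of the issue.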
