confluentinc / schema-registry

Confluent Schema Registry for Kafka
https://docs.confluent.io/current/schema-registry/docs/index.html

Certain JMX metrics fail to report accurate values #1379

Open siobhansabino opened 4 years ago

siobhansabino commented 4 years ago

Note: We are running the latest Schema Registry version, using the Confluent Docker image.

Following the documentation provided for obtaining JMX metrics, we have found that the following metrics only report zero, regardless of reality:

However, we do have the following metrics reporting values as expected:

(We cannot be certain as to the state of subjects.versions.register.request-error-rate, and thus have not included it in the list.)
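For reference, a minimal sketch of reading one of these attributes over a remote JMX connection; the MBean object name and attribute names below are assumptions based on the default kafka.schema.registry metrics prefix and may differ between versions, so adjust them to whatever jconsole actually shows.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ReadSchemaRegistryMetric {
  public static void main(String[] args) throws Exception {
    // Assumes the Schema Registry JVM was started with remote JMX enabled on port 9999.
    JMXServiceURL url =
        new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
    try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
      MBeanServerConnection mbsc = connector.getMBeanServerConnection();
      // Object/attribute names are assumptions based on the default
      // "kafka.schema.registry" metrics prefix; verify them in jconsole first.
      ObjectName jersey = new ObjectName("kafka.schema.registry:type=jersey-metrics");
      Object requestRate =
          mbsc.getAttribute(jersey, "compatibility.subjects.versions.verify.request-rate");
      Object errorRate =
          mbsc.getAttribute(jersey, "compatibility.subjects.versions.verify.request-error-rate");
      System.out.println("request-rate=" + requestRate + " request-error-rate=" + errorRate);
    }
  }
}
```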

We caught this inaccuracy because clients interacting with the Schema Registry were receiving rejections on compatibility checks, yet the metrics do not show any requests or responses for that endpoint at all.

Please advise if we can provide any further information about this issue.

OneCricketeer commented 4 years ago

I don't think clients actually hit the /compatibility endpoint during registration

You'd have to expect some subset of those exceptions are coming from the /subjects/{name}/versions endpoint, which validates the request payload internally using the same methods behind /compatibility
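To make the distinction concrete, registration from a client boils down to something like the request below (a rough sketch using java.net.http; the subject name and schema are placeholders), and it is this call, not /compatibility, that runs the compatibility validation internally:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterSchemaExample {
  public static void main(String[] args) throws Exception {
    // Placeholder subject and a trivial Avro schema; the register endpoint
    // itself validates compatibility against existing versions of the subject.
    String body = "{\"schema\": \"{\\\"type\\\": \\\"string\\\"}\"}";
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8081/subjects/my-topic-value/versions"))
        .header("Content-Type", "application/vnd.schemaregistry.v1+json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    // A 409 here is a compatibility rejection surfacing through the
    // /subjects/{subject}/versions endpoint rather than /compatibility.
    System.out.println(response.statusCode() + " " + response.body());
  }
}
```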

siobhansabino commented 4 years ago

This is not about registration; this is for normal producing, where the schema's compatibility is checked before any message is sent, so we'd expect the overwhelming majority of calls to the Schema Registry to be compatibility checks. As noted, we can see the clients recording their requests and responses to the Schema Registry around this endpoint, but the Schema Registry itself does not report any of that.

OneCricketeer commented 4 years ago

this is for normal producing where the schema's compatibility is checked before any message is sent

Can you please point me to the serializer line that calls the compatibility method? I'm only seeing register :confused:

https://github.com/confluentinc/schema-registry/blob/master/schema-serializer/src/main/java/io/confluent/kafka/serializers/AbstractKafkaSchemaSerDe.java
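For reference, the serializer path boils down to roughly the client calls below (a sketch against CachedSchemaRegistryClient; exact method signatures vary by client version, and testCompatibility only happens as a separate, explicit call, never from the serializer itself):

```java
import io.confluent.kafka.schemaregistry.avro.AvroSchema;
import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;

public class SerializerPathSketch {
  public static void main(String[] args) throws Exception {
    SchemaRegistryClient client =
        new CachedSchemaRegistryClient("http://localhost:8081", 100);

    AvroSchema schema = new AvroSchema("{\"type\": \"string\"}");

    // What the serializer effectively does on first use of a schema:
    // register (or look up) the schema and cache the returned id.
    int id = client.register("my-topic-value", schema);
    System.out.println("registered id = " + id);

    // The client does expose testCompatibility, but it is an explicit,
    // separate call; the serializer does not invoke it.
    boolean compatible = client.testCompatibility("my-topic-value", schema);
    System.out.println("compatible = " + compatible);
  }
}
```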

siobhansabino commented 4 years ago

We do not have any JVM producers, so we instead call the Schema Registry compatibility API directly ourselves.
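That direct call is roughly the following (a sketch in Java for illustration only, since the actual callers are not JVM-based; the subject name and schema are placeholders):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CompatibilityCheckExample {
  public static void main(String[] args) throws Exception {
    // Check a candidate schema against the latest registered version
    // of the subject before producing.
    String body = "{\"schema\": \"{\\\"type\\\": \\\"string\\\"}\"}";
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(
            "http://localhost:8081/compatibility/subjects/my-topic-value/versions/latest"))
        .header("Content-Type", "application/vnd.schemaregistry.v1+json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    // Expected response body: {"is_compatible": true} or false.
    System.out.println(response.statusCode() + " " + response.body());
  }
}
```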

OneCricketeer commented 4 years ago

I see. Sorry, I should have clarified "normal producing", since no other Confluent-provided serializer calls the compatibility endpoint either, to my knowledge.

Regarding the metrics: they are all set up the same way using annotations, so they should work.

https://github.com/confluentinc/schema-registry/blob/master/core/src/main/java/io/confluent/kafka/schemaregistry/rest/resources/CompatibilityResource.java#L83
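Roughly, the pattern is a @PerformanceMetric annotation on each JAX-RS resource method (a simplified sketch rather than the exact source; the metric name shown is taken from the linked class but may differ by version):

```java
import io.confluent.rest.annotations.PerformanceMetric;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;

@Path("/compatibility")
public class CompatibilityResourceSketch {

  // Simplified sketch: the annotation on the resource method is what should
  // drive the compatibility.* request/response metrics exposed over JMX.
  @POST
  @Path("/subjects/{subject}/versions/{version}")
  @PerformanceMetric("compatibility.subjects.versions.verify")
  public void testCompatibilityBySubjectName(
      @PathParam("subject") String subject,
      @PathParam("version") String version) {
    // ... delegates to the registry core to run the actual compatibility check
  }
}
```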

yesemsanthoshkumar commented 2 years ago

@siobhansabino Did you find a fix for this? We are on the confluentinc/cp-schema-registry:6.2.0 Docker image and are facing the same issue.

OneCricketeer commented 2 years ago

If you manually call the /compatibility API endpoints, do they remain zero in the metrics?

yesemsanthoshkumar commented 2 years ago

@OneCricketeer We don't have any custom producers yet, so we can't speak to the compatibility API.

But we do have Debezium and Hudi interacting with the Schema Registry. Even when onboarding new tables in Debezium (where the schema is registered with the Schema Registry) and in Hudi jobs (where the schema is read from the Schema Registry), the request error rate remains zero all the time, yet we do see failing 5xx responses from the Schema Registry in our job logs. Those failures are not reflected in our JMX metrics for the Schema Registry, and even the 2xx responses are not reflected. This behaviour is consistent across all the endpoint metrics.