rabbitmq / rabbitmq-amqp-dotnet-client

RabbitMQ client for AMQP 1.0 protocol
Other
11 stars 2 forks source link

Implement metrics ( See `System.Diagnostics.DiagnosticSource`) #58

Closed Gsantomaggio closed 3 days ago

Gsantomaggio commented 2 months ago

Is your feature request related to a problem? Please describe.

Implement client metrics

Describe the solution you'd like

See System.Diagnostics.DiagnosticSource

Describe alternatives you've considered

No response

Additional context

No response

Gsantomaggio commented 2 months ago

See also: https://rabbitmq.github.io/rabbitmq-amqp-java-client/snapshot/htmlsingle/#metrics-collection

aygalinc commented 3 weeks ago

Do you need help on this subject ? I Can try a basic implem

Gsantomaggio commented 3 weeks ago

@aygalinc Thank you. It would be great if you could work on it.

The idea is to provide the same Java client metrics. You can have a look here: https://github.com/rabbitmq/rabbitmq-amqp-java-client/blob/main/src/test/java/com/rabbitmq/client/amqp/perf/AmqpPerfTest.java#L51

The idea is to create an interface for the metrics; see this branch: https://github.com/rabbitmq/rabbitmq-amqp-dotnet-client/blob/metrics/RabbitMQ.AMQP.Client/IMetricsCollector.cs

Then, implement the class based on the interface. By default, it should work like Java with a class NoOpMetricsCollector

About Prometheus:

We don't want to add other dependencies on this client, so each implementation should be done in another project with the correct implementation.

I hope it is clear.

if you have questions: You can find me on https://www.rabbitmq.com/blog/2023/04/04/announcing-rabbitmq-community-discord-server user: gsantomaggio

.

Thank you

aygalinc commented 3 weeks ago

You do not prefer to base your work on Otel metric semconv to define the set of needed metrics ? https://opentelemetry.io/docs/specs/semconv/messaging/messaging-metrics/

Gsantomaggio commented 3 weeks ago

It is ok with me. In this client btw we'd need to have these metrics:

# HELP rabbitmq_amqp_connections  
# TYPE rabbitmq_amqp_connections gauge
rabbitmq_amqp_connections 1.0
# HELP rabbitmq_amqp_consumed_total  
# TYPE rabbitmq_amqp_consumed_total counter
rabbitmq_amqp_consumed_total 30114.0
# HELP rabbitmq_amqp_consumed_accepted_total  
# TYPE rabbitmq_amqp_consumed_accepted_total counter
rabbitmq_amqp_consumed_accepted_total 30114.0
# HELP rabbitmq_amqp_consumed_discarded_total  
# TYPE rabbitmq_amqp_consumed_discarded_total counter
rabbitmq_amqp_consumed_discarded_total 0.0
# HELP rabbitmq_amqp_consumed_requeued_total  
# TYPE rabbitmq_amqp_consumed_requeued_total counter
rabbitmq_amqp_consumed_requeued_total 0.0
# HELP rabbitmq_amqp_consumers  
# TYPE rabbitmq_amqp_consumers gauge
rabbitmq_amqp_consumers 1.0
# HELP rabbitmq_amqp_latency_seconds message latency
# TYPE rabbitmq_amqp_latency_seconds summary
rabbitmq_amqp_latency_seconds{quantile="0.5"} 0.009404416
rabbitmq_amqp_latency_seconds{quantile="0.75"} 0.017793024
rabbitmq_amqp_latency_seconds{quantile="0.95"} 0.079659008
rabbitmq_amqp_latency_seconds{quantile="0.99"} 0.121602048
rabbitmq_amqp_latency_seconds_count 30114
rabbitmq_amqp_latency_seconds_sum 268.778
# HELP rabbitmq_amqp_latency_seconds_max message latency
# TYPE rabbitmq_amqp_latency_seconds_max gauge
rabbitmq_amqp_latency_seconds_max 0.143
# HELP rabbitmq_amqp_published_total  
# TYPE rabbitmq_amqp_published_total counter
rabbitmq_amqp_published_total 30295.0
# HELP rabbitmq_amqp_published_accepted_total  
# TYPE rabbitmq_amqp_published_accepted_total counter
rabbitmq_amqp_published_accepted_total 30186.0
# HELP rabbitmq_amqp_published_accepted_latency_seconds published message disposition latency
# TYPE rabbitmq_amqp_published_accepted_latency_seconds summary
rabbitmq_amqp_published_accepted_latency_seconds{quantile="0.5"} 0.00704512
rabbitmq_amqp_published_accepted_latency_seconds{quantile="0.75"} 0.012025856
rabbitmq_amqp_published_accepted_latency_seconds{quantile="0.95"} 0.064978944
rabbitmq_amqp_published_accepted_latency_seconds{quantile="0.99"} 0.083853312
rabbitmq_amqp_published_accepted_latency_seconds_count 30234
rabbitmq_amqp_published_accepted_latency_seconds_sum 216.127
# HELP rabbitmq_amqp_published_accepted_latency_seconds_max published message disposition latency
# TYPE rabbitmq_amqp_published_accepted_latency_seconds_max gauge
rabbitmq_amqp_published_accepted_latency_seconds_max 0.086
# HELP rabbitmq_amqp_published_rejected_total  
# TYPE rabbitmq_amqp_published_rejected_total counter
rabbitmq_amqp_published_rejected_total 0.0
# HELP rabbitmq_amqp_published_released_total  
# TYPE rabbitmq_amqp_published_released_total counter
rabbitmq_amqp_published_released_total 0.0
# HELP rabbitmq_amqp_publishers  
# TYPE rabbitmq_amqp_publishers gauge
rabbitmq_amqp_publishers 1.0
aygalinc commented 3 weeks ago

Do you want OTEL metric or the custom one you describe : becasue for me, the one you describe for ex :

# HELP rabbitmq_amqp_published_total  
# TYPE rabbitmq_amqp_published_total counter
rabbitmq_amqp_published_total 30295.0
# HELP rabbitmq_amqp_published_accepted_total  
# TYPE rabbitmq_amqp_published_accepted_total counter

In otel semanctic convention should be translated to only one metric :

# HELP messaging_client_sent_messages_total  
# TYPE messaging_client_sent_messages_total counter
messaging_client_sent_messages_total

with the absence or thee presence of the tag error.type if the message is published well or not. And the tag messaging.system will be set to rabbitmq .

lukebakken commented 3 weeks ago

@Gsantomaggio @aygalinc please refer to the AMQP 0.9.1 .NET client issues and pull requests, as there has been a LOT of discussion with Microsoft on OTel related naming

https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1528 https://github.com/rabbitmq/rabbitmq-dotnet-client/pull/1261 https://github.com/rabbitmq/rabbitmq-dotnet-client/issues/1715

Gsantomaggio commented 3 weeks ago

Thank you @lukebakken .

aygalinc commented 3 weeks ago

Hey ! Thanks for the pointers: issues and PR refer to tracing.(Span attribute converge with metrics tags to have a common base of understanding) The fact is that @Gsantomaggio points me a set of metric that is quite different from the open telemetry one in term of naming and tags. For me it's weird to adopt Otel on the tracing side and go for a different convention on the metrics side. But I also understand the fact that other clients have been developed and you want a uniform ecosystem.

Gsantomaggio commented 3 weeks ago

Sorry for the confusion. My fault. Please follow the @lukebakken suggestions. He is more expert than me in this area.

Thank you

lukebakken commented 3 weeks ago

For me it's weird to adopt Otel on the tracing side

That's not what I'm suggesting, just that there has been so much discussion around naming things that it's probably relevant to naming things for metrics.