knative-extensions / eventing-kafka-broker

Alternate Kafka Broker implementation.
Apache License 2.0

Kafka Source does not expose documented metrics #3937

Open pfilaretov42 opened 2 weeks ago

pfilaretov42 commented 2 weeks ago

Describe the bug

Hi!

According to the Knative Eventing metrics documentation for eventing sources, two metrics should be available: event_count and retry_event_count.

If I deploy a Ping Source, I can see that the pingsource_event_count metric is available. However, I cannot find an event_count metric for the deployed Kafka Source.

Expected behavior

Kafka Source exposes event_count and retry_event_count metrics.

To Reproduce

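import java.net.URI;
import java.util.List;
import java.util.UUID;

import io.cloudevents.CloudEvent;
import io.cloudevents.core.builder.CloudEventBuilder;
import io.cloudevents.spring.mvc.CloudEventHttpMessageConverter;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpHeaders;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@SpringBootApplication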
public class KnativeDemoApplication {

    @Value("${TARGET:World}")
    String target;

    @RestController
    class HelloWorldController {
        @GetMapping("/")
        String hello(@RequestHeader(name = HttpHeaders.AUTHORIZATION, required = false) String auth) {
            return "Hello " + target + "!";
        }

        @PostMapping("/events")
        CloudEvent consumeEvent(
            @RequestBody CloudEvent event,
            @RequestHeader(name = HttpHeaders.AUTHORIZATION, required = false) String auth
        ) {
            return CloudEventBuilder.from(event)
                .withId(UUID.randomUUID().toString())
                .withSource(URI.create("https://spring.io/foos"))
                .withType("io.spring.event.Foo")
                .withData(event.getData().toBytes())
                .build();
        }
    }

    @Configuration
    static class CloudEventHandlerConfiguration implements WebMvcConfigurer {
        @Override
        public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
            converters.add(0, new CloudEventHttpMessageConverter());
        }

    }

    public static void main(String[] args) {
        SpringApplication.run(KnativeDemoApplication.class, args);
    }

}

Service monitors for Kafka Source:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-controller
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-controller
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-source-dispatcher
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-source-dispatcher
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-webhook
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-webhook-eventing
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
```

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.1/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.1/serving-core.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.3/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.3/eventing-core.yaml
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.5/eventing-kafka-controller.yaml
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.5/eventing-kafka-source.yaml

Additional context

Here are the Kafka-related metrics I can see in Prometheus instead of event_count:

Metrics:

```
"kafka_broker_controller_client_latency_bucket",
"kafka_broker_controller_client_latency_count",
"kafka_broker_controller_client_latency_sum",
"kafka_broker_controller_client_results",
"kafka_broker_controller_consumer_group_expected_replicas",
"kafka_broker_controller_consumer_group_ready_replicas",
"kafka_broker_controller_go_alloc",
"kafka_broker_controller_go_bucket_hash_sys",
"kafka_broker_controller_go_frees",
"kafka_broker_controller_go_gc_cpu_fraction",
"kafka_broker_controller_go_gc_sys",
"kafka_broker_controller_go_heap_alloc",
"kafka_broker_controller_go_heap_idle",
"kafka_broker_controller_go_heap_in_use",
"kafka_broker_controller_go_heap_objects",
"kafka_broker_controller_go_heap_released",
"kafka_broker_controller_go_heap_sys",
"kafka_broker_controller_go_last_gc",
"kafka_broker_controller_go_lookups",
"kafka_broker_controller_go_mallocs",
"kafka_broker_controller_go_mcache_in_use",
"kafka_broker_controller_go_mcache_sys",
"kafka_broker_controller_go_mspan_in_use",
"kafka_broker_controller_go_mspan_sys",
"kafka_broker_controller_go_next_gc",
"kafka_broker_controller_go_num_forced_gc",
"kafka_broker_controller_go_num_gc",
"kafka_broker_controller_go_other_sys",
"kafka_broker_controller_go_stack_in_use",
"kafka_broker_controller_go_stack_sys",
"kafka_broker_controller_go_sys",
"kafka_broker_controller_go_total_alloc",
"kafka_broker_controller_go_total_gc_pause_ns",
"kafka_broker_controller_initialize_offsets_latency_bucket",
"kafka_broker_controller_initialize_offsets_latency_count",
"kafka_broker_controller_initialize_offsets_latency_sum",
"kafka_broker_controller_reconcile_count",
"kafka_broker_controller_reconcile_latency_bucket",
"kafka_broker_controller_reconcile_latency_count",
"kafka_broker_controller_reconcile_latency_sum",
"kafka_broker_controller_schedule_latency_bucket",
"kafka_broker_controller_schedule_latency_count",
"kafka_broker_controller_schedule_latency_sum",
"kafka_broker_controller_work_queue_depth",
"kafka_broker_controller_workqueue_adds_total",
"kafka_broker_controller_workqueue_depth",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_bucket",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_count",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_sum",
"kafka_broker_controller_workqueue_queue_latency_seconds_bucket",
"kafka_broker_controller_workqueue_queue_latency_seconds_count",
"kafka_broker_controller_workqueue_queue_latency_seconds_sum",
"kafka_broker_controller_workqueue_retries_total",
"kafka_broker_controller_workqueue_unfinished_work_seconds_bucket",
"kafka_broker_controller_workqueue_unfinished_work_seconds_count",
"kafka_broker_controller_workqueue_unfinished_work_seconds_sum",
"kafka_broker_controller_workqueue_work_duration_seconds_bucket",
"kafka_broker_controller_workqueue_work_duration_seconds_count",
"kafka_broker_controller_workqueue_work_duration_seconds_sum",
"kafka_webhook_eventing_client_latency_bucket",
"kafka_webhook_eventing_client_latency_count",
"kafka_webhook_eventing_client_latency_sum",
"kafka_webhook_eventing_client_results",
"kafka_webhook_eventing_go_alloc",
"kafka_webhook_eventing_go_bucket_hash_sys",
"kafka_webhook_eventing_go_frees",
"kafka_webhook_eventing_go_gc_cpu_fraction",
"kafka_webhook_eventing_go_gc_sys",
"kafka_webhook_eventing_go_heap_alloc",
"kafka_webhook_eventing_go_heap_idle",
"kafka_webhook_eventing_go_heap_in_use",
"kafka_webhook_eventing_go_heap_objects",
"kafka_webhook_eventing_go_heap_released",
"kafka_webhook_eventing_go_heap_sys",
"kafka_webhook_eventing_go_last_gc",
"kafka_webhook_eventing_go_lookups",
"kafka_webhook_eventing_go_mallocs",
"kafka_webhook_eventing_go_mcache_in_use",
"kafka_webhook_eventing_go_mcache_sys",
"kafka_webhook_eventing_go_mspan_in_use",
"kafka_webhook_eventing_go_mspan_sys",
"kafka_webhook_eventing_go_next_gc",
"kafka_webhook_eventing_go_num_forced_gc",
"kafka_webhook_eventing_go_num_gc",
"kafka_webhook_eventing_go_other_sys",
"kafka_webhook_eventing_go_stack_in_use",
"kafka_webhook_eventing_go_stack_sys",
"kafka_webhook_eventing_go_sys",
"kafka_webhook_eventing_go_total_alloc",
"kafka_webhook_eventing_go_total_gc_pause_ns",
"kafka_webhook_eventing_reconcile_count",
"kafka_webhook_eventing_reconcile_latency_bucket",
"kafka_webhook_eventing_reconcile_latency_count",
"kafka_webhook_eventing_reconcile_latency_sum",
"kafka_webhook_eventing_request_count",
"kafka_webhook_eventing_request_latencies_bucket",
"kafka_webhook_eventing_request_latencies_count",
"kafka_webhook_eventing_request_latencies_sum",
"kafka_webhook_eventing_work_queue_depth",
"kafka_webhook_eventing_workqueue_adds_total",
"kafka_webhook_eventing_workqueue_depth",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_bucket",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_count",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_sum",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_bucket",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_count",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_sum",
"kafka_webhook_eventing_workqueue_retries_total",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_bucket",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_count",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_sum",
"kafka_webhook_eventing_workqueue_work_duration_seconds_bucket",
"kafka_webhook_eventing_workqueue_work_duration_seconds_count",
"kafka_webhook_eventing_workqueue_work_duration_seconds_sum"
```

However, I cannot find anything similar to event_count.
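
For anyone repeating this check against a long list of metric names, a small filter can be easier than scanning by eye. The following is only a sketch: the class name `MetricNameSearch` and the method `findMatching` are hypothetical, and how the names are collected from Prometheus is left out. It keeps the names that contain a given fragment, ignoring case.

```java
import java.util.List;
import java.util.stream.Collectors;

public class MetricNameSearch {

    // Returns the metric names containing the fragment (case-insensitive), sorted.
    static List<String> findMatching(List<String> metricNames, String fragment) {
        String needle = fragment.toLowerCase();
        return metricNames.stream()
            .filter(name -> name.toLowerCase().contains(needle))
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A few names taken from this issue; the input list is illustrative only.
        List<String> names = List.of(
            "kafka_broker_controller_reconcile_count",
            "kafka_broker_controller_consumer_group_ready_replicas",
            "kafka_webhook_eventing_request_count",
            "pingsource_event_count");

        // Prints the names that contain "event_count".
        System.out.println(findMatching(names, "event_count"));
    }
}
```

With this sample input only `pingsource_event_count` matches, because none of the `kafka_*` names contains the fragment `event_count`.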

pierDipi commented 2 weeks ago

/assign

pierDipi commented 1 week ago

@pfilaretov42 the Knative Kafka components expose slightly different metric names due to a limitation in the Java libraries we use. Here is the list: https://docs.google.com/document/d/10aAwq5Sa6PpNy6W9wsAyZ9D73aZChJxm95kMJG9RR4Q/edit (you need to join the knative-users Google group). Can you see them, or are they still not available for you?

pfilaretov42 commented 1 week ago

Hi @pierDipi, I requested access to the document in Google Docs.

pierDipi commented 1 week ago

Access granted. Access is also given to members of the knative-dev and knative-users Google groups.