knative-extensions / eventing-kafka-broker

Alternate Kafka Broker implementation.
Apache License 2.0

Kafka Source does not expose documented metrics #3937

Closed pfilaretov42 closed 1 month ago

pfilaretov42 commented 5 months ago

Describe the bug

Hi!

As per the Knative Eventing metrics documentation for eventing sources, two metrics should be available: event_count and retry_event_count.

If I deploy a Ping Source, I can see the pingsource_event_count metric. However, I cannot find an event_count metric for the deployed Kafka Source.

Expected behavior

Kafka Source exposes event_count and retry_event_count metrics.
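
One way to check this independently of Prometheus is to read a pod's /metrics output directly and test it for the expected names. The following is a minimal Java sketch, assuming the output is in the Prometheus text exposition format. The class name `MetricsTextCheck`, its methods, and the sample text (including the help string and the `response_code` label) are hypothetical and only for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class MetricsTextCheck {

    /** Extracts the distinct metric names from Prometheus text exposition output. */
    public static Set<String> metricNames(String expositionText) {
        Set<String> names = new TreeSet<>();
        for (String line : expositionText.split("\\R")) {
            String trimmed = line.trim();
            // Skip blank lines and "# HELP" / "# TYPE" comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            // A sample line is "name{labels} value" or "name value",
            // so the name ends at the first '{' or whitespace character.
            int end = 0;
            while (end < trimmed.length()
                    && trimmed.charAt(end) != '{'
                    && !Character.isWhitespace(trimmed.charAt(end))) {
                end++;
            }
            names.add(trimmed.substring(0, end));
        }
        return names;
    }

    /** Returns the expected names that do not appear (as exact names) in the output. */
    public static List<String> missing(String expositionText, List<String> expected) {
        Set<String> present = metricNames(expositionText);
        return expected.stream()
                .filter(name -> !present.contains(name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample; real text would come from a pod's /metrics endpoint.
        String sample = String.join("\n",
                "# HELP pingsource_event_count Number of events sent",
                "# TYPE pingsource_event_count counter",
                "pingsource_event_count{response_code=\"202\"} 3",
                "kafka_broker_controller_reconcile_count 12");
        System.out.println(missing(sample, List.of("event_count", "retry_event_count")));
    }
}
```

The check is deliberately an exact-name match, so a prefixed name such as pingsource_event_count does not count as event_count.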

To Reproduce

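```java
import java.net.URI;
import java.util.List;
import java.util.UUID;

import io.cloudevents.CloudEvent;
import io.cloudevents.core.builder.CloudEventBuilder;
import io.cloudevents.spring.mvc.CloudEventHttpMessageConverter;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.HttpHeaders;
import org.springframework.http.converter.HttpMessageConverter;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@SpringBootApplication
public class KnativeDemoApplication {

    // Greeting target, taken from the TARGET environment variable (default "World").
    @Value("${TARGET:World}")
    String target;

    @RestController
    class HelloWorldController {
        @GetMapping("/")
        String hello(@RequestHeader(name = HttpHeaders.AUTHORIZATION, required = false) String auth) {
            return "Hello " + target + "!";
        }

        // Receives a CloudEvent and replies with a new event carrying the same data.
        @PostMapping("/events")
        CloudEvent consumeEvent(
            @RequestBody CloudEvent event,
            @RequestHeader(name = HttpHeaders.AUTHORIZATION, required = false) String auth
        ) {
            return CloudEventBuilder.from(event)
                .withId(UUID.randomUUID().toString())
                .withSource(URI.create("https://spring.io/foos"))
                .withType("io.spring.event.Foo")
                .withData(event.getData().toBytes())
                .build();
        }
    }

    // Registers the CloudEvents converter first so CloudEvent bodies are handled by it.
    @Configuration
    static class CloudEventHandlerConfiguration implements WebMvcConfigurer {
        @Override
        public void configureMessageConverters(List<HttpMessageConverter<?>> converters) {
            converters.add(0, new CloudEventHttpMessageConverter());
        }

    }

    public static void main(String[] args) {
        SpringApplication.run(KnativeDemoApplication.class, args);
    }

}
```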

Service monitors for Kafka Source:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-controller
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-controller
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-source-dispatcher
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-source-dispatcher
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-webhook
  namespace: knative-eventing
spec:
  namespaceSelector:
    matchNames:
      - knative-eventing
  selector:
    matchLabels:
      app: kafka-webhook-eventing
  podMetricsEndpoints:
    - honorLabels: true
      path: /metrics
      port: metrics
```

kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.1/serving-crds.yaml
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.14.1/serving-core.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.3/eventing-crds.yaml
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.14.3/eventing-core.yaml
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.5/eventing-kafka-controller.yaml
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.5/eventing-kafka-source.yaml

Additional context

Here are the kafka related metrics I can see in Prometheus instead of event_count:

Metrics:

```
"kafka_broker_controller_client_latency_bucket",
"kafka_broker_controller_client_latency_count",
"kafka_broker_controller_client_latency_sum",
"kafka_broker_controller_client_results",
"kafka_broker_controller_consumer_group_expected_replicas",
"kafka_broker_controller_consumer_group_ready_replicas",
"kafka_broker_controller_go_alloc",
"kafka_broker_controller_go_bucket_hash_sys",
"kafka_broker_controller_go_frees",
"kafka_broker_controller_go_gc_cpu_fraction",
"kafka_broker_controller_go_gc_sys",
"kafka_broker_controller_go_heap_alloc",
"kafka_broker_controller_go_heap_idle",
"kafka_broker_controller_go_heap_in_use",
"kafka_broker_controller_go_heap_objects",
"kafka_broker_controller_go_heap_released",
"kafka_broker_controller_go_heap_sys",
"kafka_broker_controller_go_last_gc",
"kafka_broker_controller_go_lookups",
"kafka_broker_controller_go_mallocs",
"kafka_broker_controller_go_mcache_in_use",
"kafka_broker_controller_go_mcache_sys",
"kafka_broker_controller_go_mspan_in_use",
"kafka_broker_controller_go_mspan_sys",
"kafka_broker_controller_go_next_gc",
"kafka_broker_controller_go_num_forced_gc",
"kafka_broker_controller_go_num_gc",
"kafka_broker_controller_go_other_sys",
"kafka_broker_controller_go_stack_in_use",
"kafka_broker_controller_go_stack_sys",
"kafka_broker_controller_go_sys",
"kafka_broker_controller_go_total_alloc",
"kafka_broker_controller_go_total_gc_pause_ns",
"kafka_broker_controller_initialize_offsets_latency_bucket",
"kafka_broker_controller_initialize_offsets_latency_count",
"kafka_broker_controller_initialize_offsets_latency_sum",
"kafka_broker_controller_reconcile_count",
"kafka_broker_controller_reconcile_latency_bucket",
"kafka_broker_controller_reconcile_latency_count",
"kafka_broker_controller_reconcile_latency_sum",
"kafka_broker_controller_schedule_latency_bucket",
"kafka_broker_controller_schedule_latency_count",
"kafka_broker_controller_schedule_latency_sum",
"kafka_broker_controller_work_queue_depth",
"kafka_broker_controller_workqueue_adds_total",
"kafka_broker_controller_workqueue_depth",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_bucket",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_count",
"kafka_broker_controller_workqueue_longest_running_processor_seconds_sum",
"kafka_broker_controller_workqueue_queue_latency_seconds_bucket",
"kafka_broker_controller_workqueue_queue_latency_seconds_count",
"kafka_broker_controller_workqueue_queue_latency_seconds_sum",
"kafka_broker_controller_workqueue_retries_total",
"kafka_broker_controller_workqueue_unfinished_work_seconds_bucket",
"kafka_broker_controller_workqueue_unfinished_work_seconds_count",
"kafka_broker_controller_workqueue_unfinished_work_seconds_sum",
"kafka_broker_controller_workqueue_work_duration_seconds_bucket",
"kafka_broker_controller_workqueue_work_duration_seconds_count",
"kafka_broker_controller_workqueue_work_duration_seconds_sum",
"kafka_webhook_eventing_client_latency_bucket",
"kafka_webhook_eventing_client_latency_count",
"kafka_webhook_eventing_client_latency_sum",
"kafka_webhook_eventing_client_results",
"kafka_webhook_eventing_go_alloc",
"kafka_webhook_eventing_go_bucket_hash_sys",
"kafka_webhook_eventing_go_frees",
"kafka_webhook_eventing_go_gc_cpu_fraction",
"kafka_webhook_eventing_go_gc_sys",
"kafka_webhook_eventing_go_heap_alloc",
"kafka_webhook_eventing_go_heap_idle",
"kafka_webhook_eventing_go_heap_in_use",
"kafka_webhook_eventing_go_heap_objects",
"kafka_webhook_eventing_go_heap_released",
"kafka_webhook_eventing_go_heap_sys",
"kafka_webhook_eventing_go_last_gc",
"kafka_webhook_eventing_go_lookups",
"kafka_webhook_eventing_go_mallocs",
"kafka_webhook_eventing_go_mcache_in_use",
"kafka_webhook_eventing_go_mcache_sys",
"kafka_webhook_eventing_go_mspan_in_use",
"kafka_webhook_eventing_go_mspan_sys",
"kafka_webhook_eventing_go_next_gc",
"kafka_webhook_eventing_go_num_forced_gc",
"kafka_webhook_eventing_go_num_gc",
"kafka_webhook_eventing_go_other_sys",
"kafka_webhook_eventing_go_stack_in_use",
"kafka_webhook_eventing_go_stack_sys",
"kafka_webhook_eventing_go_sys",
"kafka_webhook_eventing_go_total_alloc",
"kafka_webhook_eventing_go_total_gc_pause_ns",
"kafka_webhook_eventing_reconcile_count",
"kafka_webhook_eventing_reconcile_latency_bucket",
"kafka_webhook_eventing_reconcile_latency_count",
"kafka_webhook_eventing_reconcile_latency_sum",
"kafka_webhook_eventing_request_count",
"kafka_webhook_eventing_request_latencies_bucket",
"kafka_webhook_eventing_request_latencies_count",
"kafka_webhook_eventing_request_latencies_sum",
"kafka_webhook_eventing_work_queue_depth",
"kafka_webhook_eventing_workqueue_adds_total",
"kafka_webhook_eventing_workqueue_depth",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_bucket",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_count",
"kafka_webhook_eventing_workqueue_longest_running_processor_seconds_sum",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_bucket",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_count",
"kafka_webhook_eventing_workqueue_queue_latency_seconds_sum",
"kafka_webhook_eventing_workqueue_retries_total",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_bucket",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_count",
"kafka_webhook_eventing_workqueue_unfinished_work_seconds_sum",
"kafka_webhook_eventing_workqueue_work_duration_seconds_bucket",
"kafka_webhook_eventing_workqueue_work_duration_seconds_count",
"kafka_webhook_eventing_workqueue_work_duration_seconds_sum"
```

However, I cannot find anything similar to event_count.

pierDipi commented 5 months ago

/assign

pierDipi commented 5 months ago

@pfilaretov42 the Knative Kafka components use slightly different metric names due to a limitation in the Java libraries we use. Here is the list: https://docs.google.com/document/d/10aAwq5Sa6PpNy6W9wsAyZ9D73aZChJxm95kMJG9RR4Q/edit (you need to join the knative-users Google group). Can you see them, or are they still not available for you?
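
Since the names differ from the documented ones, an exact lookup for event_count will not find them. A loose search over the scraped metric names can help narrow things down. The following is a rough Java sketch, assuming the names are already available as a list of strings; the class `MetricNameSearch` and its method are hypothetical and not part of any Knative tooling, and the actual Kafka Source metric names are not assumed here.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class MetricNameSearch {

    /** Returns the names that contain every given token, ignoring case. */
    public static List<String> containingAll(List<String> names, String... tokens) {
        return names.stream()
                .filter(name -> {
                    String lower = name.toLowerCase(Locale.ROOT);
                    for (String token : tokens) {
                        if (!lower.contains(token.toLowerCase(Locale.ROOT))) {
                            return false;
                        }
                    }
                    return true;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Two names copied from the list in this issue, plus the Ping Source metric.
        List<String> names = List.of(
                "kafka_broker_controller_reconcile_count",
                "kafka_webhook_eventing_request_count",
                "pingsource_event_count");
        System.out.println(containingAll(names, "event", "count"));
    }
}
```

Note that a substring search like this also matches the webhook metrics, because `eventing` contains `event`, so the results still need a manual look.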

pfilaretov42 commented 5 months ago

Hi @pierDipi, I have requested access to the document in Google Docs.

pierDipi commented 5 months ago

Access granted. Access is also given to members of the knative-dev and knative-users Google groups.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.