Closed yeroc closed 2 months ago
Hope this doesn't come across as overly negative. It's pretty awesome to be able to spin up a product suite supporting metrics, traces and logs with a single command!
I think this might have to do with the collection names. Can you please double-check that the generated collection names match? There's a temporary duality around adding or not adding the unit when OTel metrics are converted to Prometheus.
@grcevski Thanks for responding. Is "collection name" a Grafana or Prometheus term? I tried Googling but I'm failing to turn up exactly what you're referring to here. Is that equivalent to the metric name? Or something else?
Sorry I mean the Prometheus series name. I apologize for the confusion.
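To make the unit duality concrete, here is a rough sketch of how an OTel metric name could end up as a Prometheus series name with or without the unit suffix. It is a simplified illustration, not the actual collector code: the `prometheus_name` helper, its `add_unit` switch and the small unit table are assumptions made for this example, and the real translation handles many more units and edge cases.

```python
import re

# Simplified unit table (assumption for this sketch; the real mapping is larger).
UNIT_SUFFIXES = {"s": "seconds", "ms": "milliseconds", "By": "bytes", "1": "ratio"}


def prometheus_name(otel_name, unit="", monotonic_counter=False, add_unit=True):
    """Approximate the OTel -> Prometheus series name translation."""
    # Characters that are not valid in a Prometheus name (such as '.') become '_'.
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", otel_name)
    suffix = UNIT_SUFFIXES.get(unit, "")
    # The unit suffix is only appended when enabled and not already present.
    if add_unit and suffix and not name.endswith("_" + suffix):
        name = f"{name}_{suffix}"
    # Monotonic counters conventionally get a trailing "_total".
    if monotonic_counter and not name.endswith("_total"):
        name = f"{name}_total"
    return name


print(prometheus_name("http.server.request.duration", "s"))
print(prometheus_name("http.server.request.duration", "s", add_unit=False))
print(prometheus_name("jvm.cpu.time", "s", monotonic_counter=True))
```

A dashboard query written against the name with the unit suffix would return nothing if the series was stored without it, and vice versa, which is why the names are worth comparing.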
@grcevski If I'm understanding correctly, here's the list of metric series names populated:
"http_client_request_duration_seconds",
"http_server_request_duration_seconds",
"jvm_class_count",
"jvm_class_loaded_total",
"jvm_class_unloaded_total",
"jvm_cpu_count",
"jvm_cpu_recent_utilization_ratio",
"jvm_cpu_time_seconds_total",
"jvm_gc_duration_seconds",
"jvm_memory_committed_bytes",
"jvm_memory_limit_bytes",
"jvm_memory_used_after_last_gc_bytes",
"jvm_memory_used_bytes",
"jvm_thread_count",
"otelcol_exporter_queue_capacity",
"otelcol_exporter_queue_size",
"otelcol_exporter_send_failed_log_records_total",
"otelcol_exporter_send_failed_metric_points_total",
"otelcol_exporter_send_failed_spans_total",
"otelcol_exporter_sent_log_records_total",
"otelcol_exporter_sent_metric_points_total",
"otelcol_exporter_sent_spans_total",
"otelcol_http_server_duration_bucket",
"otelcol_http_server_duration_count",
"otelcol_http_server_duration_sum",
"otelcol_http_server_request_content_length_total",
"otelcol_http_server_response_content_length_total",
"otelcol_process_cpu_seconds_total",
"otelcol_process_memory_rss",
"otelcol_process_runtime_alloc_bytes_total",
"otelcol_process_runtime_heap_alloc_bytes",
"otelcol_process_runtime_total_sys_memory_bytes",
"otelcol_process_uptime_total",
"otelcol_processor_batch_batch_send_size_bucket",
"otelcol_processor_batch_batch_send_size_count",
"otelcol_processor_batch_batch_send_size_sum",
"otelcol_processor_batch_metadata_cardinality",
"otelcol_processor_batch_timeout_trigger_send_total",
"otelcol_receiver_accepted_log_records_total",
"otelcol_receiver_accepted_metric_points_total",
"otelcol_receiver_accepted_spans_total",
"otelcol_receiver_refused_log_records_total",
"otelcol_receiver_refused_metric_points_total",
"otelcol_receiver_refused_spans_total",
"otlp_exporter_exported_total",
"otlp_exporter_seen_total",
"processedLogs_total",
"processedSpans_total",
"queueSize_ratio",
"scrape_duration_seconds",
"scrape_samples_post_metric_relabeling",
"scrape_samples_scraped",
"scrape_series_added",
"target_info",
"traces_service_graph_request_client_seconds_bucket",
"traces_service_graph_request_client_seconds_count",
"traces_service_graph_request_client_seconds_sum",
"traces_service_graph_request_server_seconds_bucket",
"traces_service_graph_request_server_seconds_count",
"traces_service_graph_request_server_seconds_sum",
"traces_service_graph_request_total",
"up"
Not sure what I should be matching these up against? Is this related to this Grafana blog post and this OpenTelemetry Collector document that mentions Prometheus Normalization?
Hm, interesting, you don't see http_server_request_duration_seconds_bucket and http_server_request_duration_seconds_count?
Did you install the docker/grafana-dashboard-red-metrics-classic.json or docker/grafana-dashboard-red-metrics-native.json? Based on the metric series names I think you need to use docker/grafana-dashboard-red-metrics-native.json.
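As a rough way to check which dashboard a given set of series names fits, the sketch below classifies a histogram by the suffixes present. It assumes that a classic histogram appears as `_bucket`, `_count` and `_sum` series, while a native histogram appears as a single series under the base name. The `histogram_flavor` helper is hypothetical and only looks at names; it does not query Prometheus.

```python
def histogram_flavor(series_names, base):
    """Guess whether `base` is stored as a classic or a native histogram."""
    names = set(series_names)
    classic = {f"{base}_bucket", f"{base}_count", f"{base}_sum"}
    if classic <= names:
        # All three classic series exist.
        return "classic"
    if base in names:
        # Only the bare base name exists, as expected for a native histogram.
        return "native"
    return "missing"


# A few names taken from the list posted earlier in this thread.
names = [
    "http_server_request_duration_seconds",
    "otelcol_http_server_duration_bucket",
    "otelcol_http_server_duration_count",
    "otelcol_http_server_duration_sum",
]
print(histogram_flavor(names, "http_server_request_duration_seconds"))
print(histogram_flavor(names, "otelcol_http_server_duration"))
```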
@grcevski I'm using the docker container as published to Docker Hub via docker run -p 3000:3000 -p 4317:4317 -p 4318:4318 --rm -ti grafana/otel-lgtm as per the Grafana Labs blog post. I haven't cloned this repo and customized anything, thus my comment above about the confusion between the two different predefined RED dashboards that are visible in the Grafana UI. All three dashboards show No Data.
Ah I see, we should possibly expand the documentation to mention the other dashboards. Are there any other dashboards available? We need to use the 'Native Prometheus Dashboard' for the metric series names you have.
@grcevski Which other dashboards are you referring to? I see three dashboards by default:
It looks like these correspond to the dashboard definitions in the docker/grafana-dashboard-*.json files in this repo. Are there others?
OK, great, so the "RED Metrics (native histogram)" should work if you have data in "http_server_request_duration_seconds". Is it also empty?
@grcevski I seem to have data:
but nothing shows up on the dashboard:
I'm trying to understand where the docs can be improved.
I've just followed the proposed steps for native histograms - please let me know where it didn't work:
export OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION=base2_exponential_bucket_histogram
run-example.sh and generate-traffic.sh
@zeitlinger Sorry, if the intent is for this container to only work with the sample you provided you can go ahead and close this ticket. I'm feeding data in from my own application via the OpenTelemetry Java Agent. I still think it's confusing to have two RED dashboards but maybe that makes sense to OTel experts?
I still think it's confusing to have two RED dashboards
Only one of the RED dashboards can work, depending on how you send the data (controlled by OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION).
I'm happy to improve the docs if you have a suggestion :smile:
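To spell out the relationship, here is a small sketch that maps the aggregation setting to the dashboard file expected to show data. It assumes the two values `explicit_bucket_histogram` and `base2_exponential_bucket_histogram`, and that an unset variable behaves like `explicit_bucket_histogram`. The `expected_red_dashboard` helper is hypothetical and not part of this repo.

```python
import os

ENV_VAR = "OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION"

# Dashboard file names as mentioned earlier in this thread.
DASHBOARD_BY_AGGREGATION = {
    "explicit_bucket_histogram": "docker/grafana-dashboard-red-metrics-classic.json",
    "base2_exponential_bucket_histogram": "docker/grafana-dashboard-red-metrics-native.json",
}


def expected_red_dashboard(env=None):
    """Return the RED dashboard file that should show data, or None if unknown."""
    env = os.environ if env is None else env
    # Assumption: an unset variable means explicit bucket histograms.
    value = env.get(ENV_VAR, "explicit_bucket_histogram")
    # Lower-casing is a leniency of this sketch, not a documented guarantee.
    return DASHBOARD_BY_AGGREGATION.get(value.strip().lower())


print(expected_red_dashboard({ENV_VAR: "base2_exponential_bucket_histogram"}))
print(expected_red_dashboard({}))
```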
@zeitlinger For me none of the three dashboards are working with Java Agent 2.2.0 per notes above. Not sure what I'm doing wrong. I'd suggest adding docs for whatever is required for the JVM Dashboard to show information.
@zeitlinger For me none of the three dashboards are working with Java Agent 2.2.0 per notes above.
Are you using the included example app or your own? If the latter, can you point to a repo - or steps how to reproduce?
@zeitlinger My own application. The application isn't public so can't point you to a repo. I included the Java Agent config in ticket summary. Let me know what additional details you'd need. Like I said, even the JVM dashboard doesn't display anything but when I explore Metrics, Logs and Traces I do see information so I know the agent is properly activated and feeding data over.
closing as stale
I've been trying out this docker image as someone new to both the Grafana products and OpenTelemetry in general. I believe I'm your target audience. That said, I'm struggling to get any of the three sample dashboards to populate with metrics using the OpenTelemetry Java Agent with my own application. I can confirm instrumentation is working because I'm able to see some metrics using Explore, and I'm also seeing Traces and Logs populated, but all the metrics dashboards remain obstinately blank.
Here are my settings for Java Agent 2.2.0:
The above settings are trying to get the JVM and RED (native histograms) dashboards to populate.
General feedback:
jvm_* metrics are populated so I dunno?!