Closed gautam-nutalapati closed 2 years ago
Does your cortex end point end with /api/v1/remote_write?
Yes it does, I pass below to docker run:
-e PROMETHEUS_WRITE_ENDPOINT=https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-xxx/api/v1/remote_write
I am trying to reproduce this via prometheusremotewriteexporter. This looks related to opentelemetry-collector-contrib#5578
If this issue also occurs in prometheusremotewriteexporter, we should create an issue in contrib and link this one: https://github.com/open-telemetry/opentelemetry-collector-contrib.
Can you turn on debug logs and post them, please?
Logs of aws-otel-collector. I created an opentelemetry-demo-app to reproduce the issue locally. Please let me know if this issue should be moved to opentelemetry-collector-contrib; I am not able to figure out where the root cause is.
If the issue is in both the AWS and non-AWS writers, it should be in contrib, as it will no longer be an AWS-specific issue, imo.
@sethAmazon Is it possible to write to Amazon managed prometheus using prometheusremotewriteexporter? If not, I cannot reproduce this issue with prometheusremotewriteexporter.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been marked as stale for 30 days with no activity.
Describe the question Why does the awsprometheusremotewrite exporter in aws-otel-collector throw the error below:
Steps to reproduce if your question is related to an action
otel
awsprometheusremotewrite
http_client_duration_bucket
which is a histogram, and the above error in the collector logs. What did you expect to see? As I do see metrics going to Grafana and traces to X-Ray, I expect no error message to be printed.
Environment NA
Additional context It seems like this error is thrown by prometheus when trace information is not tied to metrics. e.g. metric with exemplar information from link: my_histogram_bucket{le="0.5"} 205 # {TraceID="b94cc547624c3062e17d743db422210e"} 0.175XXX 1.6XXX
Can this error be ignored? Or am I missing some configuration, which is causing this error? I cannot find much info online about this. I don't need traces to be tied to metrics.
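For reference, the exemplar line quoted above can be split into its parts with a short script. This is only a minimal sketch of the text layout (sample, then `#`, then exemplar labels, exemplar value, and an optional timestamp); `parse_sample` and both regular expressions are hypothetical helpers written for illustration, not part of the collector or of any Prometheus library. Values are kept as strings because the example values in this issue are truncated.

```python
import re

# Layout assumed here: <metric>{labels} <value> [# {exemplar labels} <value> [timestamp]]
# This sketch does not handle escaped quotes, braces inside label values,
# or sample timestamps on lines without an exemplar.
LINE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+#\s+\{(?P<ex_labels>[^}]*)\}\s+(?P<ex_value>\S+)(?:\s+(?P<ex_ts>\S+))?)?'
    r'\s*$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Split one exposition line into metric name, labels, value, and exemplar."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized sample line: {line!r}")

    exemplar = None
    if m.group("ex_labels") is not None:
        exemplar = {
            "labels": dict(LABEL_RE.findall(m.group("ex_labels"))),
            "value": m.group("ex_value"),
            "timestamp": m.group("ex_ts"),  # None when no timestamp is present
        }

    return {
        "name": m.group("name"),
        "labels": dict(LABEL_RE.findall(m.group("labels") or "")),
        "value": m.group("value"),
        "exemplar": exemplar,
    }
```

A line without the `# {...}` part parses with `exemplar` set to `None`, which is the case I am asking about here: a histogram bucket with no trace information attached.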
OTEL-Collector configuration:
Update: I have been running the metric forwarder despite this error to test it further. The error seems to affect the export of just one of the histogram's buckets. I configured aws-otel-collector to forward metrics to both prometheus and prometheusremotewriteexporter. In the Prometheus endpoint exposed by aws-otel-collector, I see the data below for the histogram:
But in Grafana, the histogram I plot looks as below:
As we can see, AMP is missing the data for one bucket. The related error shows data being dropped for this bucket:
In addition, the metrics below are published to Prometheus but are dropped when writing to AMP. These are default HTTP metrics generated by the AWS OTel Java agent:
http_client_duration_bucket and http_server_duration_bucket
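One way to pin down which bucket is dropped is to diff the `le` labels seen on the collector's Prometheus endpoint against the `le` labels that actually show up in AMP. The sketch below assumes the endpoint output has been saved as text and that the bucket bounds visible in AMP were collected by hand (for example from a Grafana query). `bucket_bounds` and `missing_buckets` are hypothetical helper names, and the sample values at the bottom are made up for illustration; they are not the data from this issue.

```python
import re

# Matches a histogram bucket sample such as: some_metric_bucket{le="5"} 3
BUCKET_RE = re.compile(r'^(?P<name>\w+_bucket)\{(?P<labels>[^}]*)\}\s+\S+')
# Matches the le label only (not other labels whose name merely ends in "le").
LE_RE = re.compile(r'(?:^|,)\s*le="([^"]*)"')


def bucket_bounds(exposition_text, metric):
    """Collect the set of `le` bounds exposed for one *_bucket metric."""
    bounds = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = BUCKET_RE.match(line)
        if m and m.group("name") == metric:
            le = LE_RE.search(m.group("labels"))
            if le:
                bounds.add(le.group(1))
    return bounds


def missing_buckets(source_text, destination_bounds, metric):
    """Return bounds present at the source but absent at the destination."""
    missing = bucket_bounds(source_text, metric) - set(destination_bounds)
    # float() accepts "+Inf", so the overflow bucket sorts last.
    return sorted(missing, key=float)


# Hypothetical sample data, for illustration only.
SAMPLE_ENDPOINT_TEXT = """
# TYPE http_client_duration histogram
http_client_duration_bucket{le="5"} 0
http_client_duration_bucket{le="10"} 1
http_client_duration_bucket{le="+Inf"} 2
"""
SAMPLE_AMP_BOUNDS = {"5", "+Inf"}

print(missing_buckets(SAMPLE_ENDPOINT_TEXT, SAMPLE_AMP_BOUNDS,
                      "http_client_duration_bucket"))
```

Running the same comparison for `http_server_duration_bucket` would show whether both histograms lose the same bound.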