Thanks for the report. Are you able to reproduce this consistently? If so, can you check whether it still reproduces after adding --feature-gates=+receiver.prometheus.OTLPDirect to the collector's CLI flags? That will enable a different metrics appender that is becoming the default in the upcoming release. I'm interested in whether this also happens with that pipeline.
It looks like the first scrape for this metric had a stale value set, which put the start time adjustment logic in a bad state. I think we can add some checks to compensate for this, but I'd like to understand better how this state arose and whether it is unique to the outgoing pipeline or also an issue with the new implementation.
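For context, here is a minimal Go sketch of the kind of check described above. The types and names are hypothetical, not the receiver's actual code; the one real detail is the bit pattern Prometheus uses for its staleness marker. The idea is that a stale sample must not be allowed to seed the start-time adjustment state.

package main

import (
	"fmt"
	"math"
)

// staleNaNBits is the quiet-NaN bit pattern Prometheus uses as its
// staleness marker (see prometheus/prometheus, model/value).
const staleNaNBits uint64 = 0x7ff0000000000002

func isStaleNaN(v float64) bool {
	return math.Float64bits(v) == staleNaNBits
}

// startTimeTracker is a hypothetical stand-in for the receiver's
// start-time adjustment state for one series.
type startTimeTracker struct {
	initialized bool
	baseline    float64
}

// observe records a sample; stale markers are ignored so a stale first
// scrape cannot put the tracker into a bad state.
func (t *startTimeTracker) observe(v float64) {
	if isStaleNaN(v) {
		return
	}
	if !t.initialized {
		t.baseline = v
		t.initialized = true
	}
}

func main() {
	var t startTimeTracker
	t.observe(math.Float64frombits(staleNaNBits)) // stale first scrape: ignored
	t.observe(42)                                 // first real sample seeds the baseline
	fmt.Println(t.initialized, t.baseline)        // true 42
}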
@Aneurysm9 I tried that and got the same error:
otc-container:
  Container ID:   docker://11f907302e89f04b0e33731d94bec87fc2d3369d8179d4faf0c9b6ded7575629
  Image:          otel/opentelemetry-collector-contrib:0.42.0
  Image ID:       docker-pullable://otel/opentelemetry-collector-contrib@sha256:d52ea80e39430e778705a3a7a2b115c4ef812073e704422fe974bcc2f9d2c60a
  Port:           <none>
  Host Port:      <none>
  Args:
    --feature-gates=-receiver.prometheus.OTLPDirect
    --metrics-level=detailed
    --config=/conf/collector.yaml
The argument is backward. It must be --feature-gates=+receiver.prometheus.OTLPDirect. Using --feature-gates=-receiver.prometheus.OTLPDirect as you have there is the default state in v0.42.0 and changes nothing.
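To illustrate the +/- semantics, here is a minimal Go sketch of how a comma-separated gate list like this could be parsed. This is a hypothetical illustration, not the collector's actual featuregate package: a "+" prefix explicitly enables a gate, a "-" prefix explicitly disables it.

package main

import (
	"fmt"
	"strings"
)

// applyGates maps a spec like "+gate.a,-gate.b" onto enable/disable
// decisions in a hypothetical gate registry.
func applyGates(spec string, gates map[string]bool) {
	for _, id := range strings.Split(spec, ",") {
		switch {
		case id == "":
			continue
		case id[0] == '+':
			gates[id[1:]] = true // explicitly enabled
		case id[0] == '-':
			gates[id[1:]] = false // explicitly disabled (the v0.42.0 default here)
		default:
			gates[id] = true
		}
	}
}

func main() {
	gates := map[string]bool{"receiver.prometheus.OTLPDirect": false}
	applyGates("+receiver.prometheus.OTLPDirect", gates)
	fmt.Println(gates["receiver.prometheus.OTLPDirect"]) // true
}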
Same problem:
panic: runtime error: index out of range [-1]

goroutine 121 [running]:
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter.addSingleHistogramDataPoint({0xc000c72360}, {0x1bf}, {0x0}, {0x0, 0x0}, 0xc2254139a0, 0x8)
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/prometheusremotewriteexporter@v0.42.0/helper.go:397 +0x8bd
Containers:
  otc-container:
    Container ID:   docker://21b690856d51bf8f15a276c90cb8222f149e2b4b3dc771dedf40ae4746c02d26
    Image:          otel/opentelemetry-collector-contrib:0.42.0
    Image ID:       docker-pullable://otel/opentelemetry-collector-contrib@sha256:d52ea80e39430e778705a3a7a2b115c4ef812073e704422fe974bcc2f9d2c60a
    Port:           <none>
    Host Port:      <none>
    Args:
      --feature-gates=+receiver.prometheus.OTLPDirect
      --metrics-level=detailed
      --config=/conf/collector.yaml
That is a different problem. This time it is in the PRW exporter, not the receiver. Here it seems that the stale marker has been correctly handled by the receiver, but the exporter is having an issue reconstructing the le=+Inf bucket. I should be able to get a fix for that prepared fairly quickly.
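For background, here is a minimal Go sketch of the conversion at issue. The helper is hypothetical, not the exporter's actual addSingleHistogramDataPoint: OTLP explicit-bucket histograms carry per-bucket counts with the implicit +Inf bucket last, while Prometheus buckets are cumulative, so an empty or malformed BucketCounts slice can push naive index arithmetic to -1.

package main

import "fmt"

// cumulativeBuckets converts OTLP-style per-bucket counts into
// Prometheus-style cumulative counts, where the final element is the
// le="+Inf" bucket. OTLP requires len(counts) == len(bounds)+1; a stale
// or empty data point can violate that, so we check before indexing
// instead of panicking with "index out of range".
func cumulativeBuckets(bounds []float64, counts []uint64) ([]uint64, bool) {
	if len(counts) != len(bounds)+1 {
		return nil, false
	}
	out := make([]uint64, len(counts))
	var cum uint64
	for i, c := range counts {
		cum += c
		out[i] = cum
	}
	return out, true
}

func main() {
	// Normal point: two bounds, three buckets (the last one is +Inf).
	fmt.Println(cumulativeBuckets([]float64{0.5, 1}, []uint64{3, 2, 1})) // [3 5 6] true

	// Stale/empty point: no counts at all; a naive len(counts)-1 index
	// would be -1, the class of panic shown above.
	fmt.Println(cumulativeBuckets(nil, nil)) // [] false
}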
Looks like there's already a PR looking to fix the PRW exporter issue.
Thanks for the help, @Aneurysm9
Do you know how long before this is in the next release? I am encountering this on one of our deployments.
We attempt to ship a release every two weeks. I would expect that this will be included in the v0.44.0 release that should happen next week.
Thanks @Aneurysm9 😄
Describe the bug
Running with many Prometheus scrapes causes memory leaks.

Steps to reproduce
Run on k8s with many Prometheus scrapes.

What version did you use?
Version: v0.42.0

What config did you use?
Config:

Environment
OS: (e.g., "Ubuntu 20.04")
Compiler (if manually compiled): (e.g., "go 14.2")

Additional context