Closed rakyll closed 3 years ago
This continues to fail even with the up
metric present:
=== CONT TestRemoteWrite/otelcollector/RepeatedLabels
labels.go:58:
Error Trace: labels.go:58
main_test.go:105
main_test.go:69
Error: Should be true
Test: TestRemoteWrite/otelcollector/RepeatedLabels
Messages: found samples for test{a="1"}, none expected
Please assign this issue to me
All yours @odeke-em
I think the Prometheus Remote Write compliance tests are wrong. I have written a sample test and used Prometheus with it and Prometheus is able to successfully get the datapoints.
package main
import (
"context"
"fmt"
"net/http"
"os"
"os/signal"
"sync/atomic"
"time"
)
func main() {
ctx, cancel := context.WithCancel(context.Background())
i := uint64(1)
go func() {
for {
select {
case <-ctx.Done():
return
case <-time.After(27 * time.Second):
atomic.AddUint64(&i, 1)
}
}
}()
handler := func(rw http.ResponseWriter, req *http.Request) {
println("Scraped", time.Now().Unix())
fmt.Fprintf(rw, `
# HELP test A gauge
# TYPE test gauge
test{a="1",a="1"} %.1f`, float64(atomic.LoadUint64(&i)))
}
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, os.Interrupt)
defer cancel()
go func() {
if err := http.ListenAndServe(":8855", http.HandlerFunc(handler)); err != nil {
panic(err)
}
}()
<-sigCh
}
Run it by go run main.go
global:
scrape_interval: 10s
external_labels:
monitor: 'otlc'
remote_write:
scrape_configs:
- job_name: 'otlc'
scrape_interval: 10s
honor_labels: true
static_configs:
- targets: ['localhost:8855']
Run it by prometheus --config.file=prom.yaml --log.level=debug
Prometheus itself doesn't reject these repeated labels, so why are the compliance tests rejecting it? Kindly cc-ing @tomwilkie @brian-brazil @rakyll @Aneurysm9 @alolita.
Prometheus itself doesn't reject these repeated labels
If that's the case, then that's a bug in Prometheus. Are you sure you're using the latest version? We had to fix this at one point.
Your Prometheus screenshot seems from a release before Prometheus 2.14, because it is the "old" UI without a link to the new UI. That link was added in 2.14. I guess you are using a Prometheus release that was released before 2019-11-11.
Duplicate labels are rejected since 2.16.0, released on 2020-02-13.
Oh I see I see! Thank you Julien and Brian for the correction. I had installed Prometheus by a go get a while ago and hadn’t updated. Let me update and try again.
On Wed, Jun 9, 2021 at 2:39 PM Julien Pivotto @.***> wrote:
Your Prometheus screenshot seems from a release before Prometheus 2.14, because it is the "old" UI without a link to the new UI. That link was added in 2.14. I guess you are using a Prometheus release that was released before 2019-11-11.
Duplicate labels are rejected since 2.16.0, released on 2020-02-13.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/open-telemetry/wg-prometheus/issues/44#issuecomment-857621823, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFL3V7XTZOWKR7LE6B4RRLTR5HFJANCNFSM43HFKZAA .
Yup, with the latest, I now see the issue
level=debug ts=2021-06-09T12:45:39.268Z caller=scrape.go:1459 component="scrape manager" scrape_pool=otlc target=http://localhost:8855/metrics msg="Unexpected error" series="test{a=\"1\",a=\"1\"}" err="label name \"a\" is not unique: invalid sample"
level=debug ts=2021-06-09T12:45:39.268Z caller=scrape.go:1247 component="scrape manager" scrape_pool=otlc target=http://localhost:8855/metrics msg="Append failed" err="label name \"a\" is not unique: invalid sample"
Thank you Brian and Julien!
This issue was a red herring, the problem is actually in the Prometheus receiver and I've filed https://github.com/open-telemetry/opentelemetry-collector/issues/3407 and am mailing a PR soon after the Prometheus Working Group meeting that's currently ongoing.
The OpenTelemetry collector is not passing https://github.com/prometheus/compliance/tree/main/remote_write TestRemoteWrite/otelcollector/RepeatedLabels.