crate / cratedb-prometheus-adapter

CrateDB Prometheus Adapter.
Apache License 2.0
60 stars 15 forks source link

Handle primary key conflicts better (upserts or ignore) #135

Closed proddata closed 1 month ago

proddata commented 7 months ago

Originated from community question: https://community.cratedb.com/t/prometheus-failed-to-write-data-into-cratedb-after-encountering-error-a-document-with-the-same-primary-key-exists-already/1724


We probably could add the option to use upserts or make it the even default approach.

Since the table definition is using primary keys we should change the insert statement

https://github.com/crate/cratedb-prometheus-adapter/blob/dc5c40f9b87ace1d543bd6cca0d6500884590c49/crate.go#L16

to either ignore conflicts:

INSERT INTO metrics ("labels", "labels_hash", "timestamp", "value", "valueRaw") VALUES ($1, $2, $3, $4, $5)
ON CONFLICT ("timestamp", "labels_hash", "day__generated")
DO NOTHING;

or handle upserts:

INSERT INTO metrics ("labels", "labels_hash", "timestamp", "value", "valueRaw") VALUES ($1, $2, $3, $4, $5)
ON CONFLICT ("timestamp", "labels_hash", "day__generated")
DO UPDATE SET
   "value" = excluded."value",
   "valueRaw" = excluded."valueRaw";

Potentially this could be configurable (to do either or).

surister commented 7 months ago

Maybe adding an environment variable ON_CONFLICT with values ignore and upsert and make upsert default?

docker run -it --rm --publish=9268:9268 -e ON_CONFLICT=ignore ghcr.io/crate/cratedb-prometheus-adapter

amotl commented 7 months ago

@proddata: Thanks for reporting this, and for your suggestions. I agree with your proposal.

@surister: I also try to approach improvements to this program each time like this again, but then, reality strikes: The program currently doesn't do much with environment variables, opposed to what one would expect for a cloud-native program. The configuration is mostly handled on behalf of configuration file settings.

Because it is possible to configure multiple CrateDB endpoints there, i.e. the program implements a flavor of multi-tenancy on that dimension, it is mostly not applicable to use settings that would apply globally.