chernomor opened 4 years ago
Hi @chernomor ,
Thanks for the report! But you didn't mention what's running on localhost:2003: what software, which version, and with which config. It's quite hard to reproduce without that information.
Thanks!
Hi @deniszh, this is go-carbon (commit https://github.com/lomik/go-carbon/commit/7b66f400678ab5071a32b6fbe66f4c79bd41d43c). I also import the current master of go-whisper into https://github.com/lomik/go-carbon/tree/master/persister with no changes.
@chernomor go-carbon's config (I mean go-carbon.conf) could be important as well, as there are several ways you can configure it.
go-carbon.conf:

```toml
[common]
user = "carbon"
graph-prefix = "carbon.agents.{host}"
metric-endpoint = "local"
metric-interval = "1m0s"
max-cpu = 4

[whisper]
data-dir = "/var/lib/graphite/whisper"
schemas-file = "/etc/go-carbon/storage-schemas.conf"
aggregation-file = "/etc/go-carbon/storage-aggregation.conf"
workers = 8
max-updates-per-second = 0
max-creates-per-second = 0
hard-max-creates-per-second = false
sparse-create = false
flock = true
enabled = true
hash-filenames = true
compressed = true
remove-empty-file = false

[cache]
max-size = 10
write-strategy = "max"

[udp]
listen = ":2003"
enabled = true
buffer-size = 0

[tcp]
listen = ":2003"
enabled = true
buffer-size = 0

[pickle]
listen = ":2004"
max-message-size = 67108864
enabled = true
buffer-size = 0

[carbonlink]
listen = "127.0.0.1:7002"
enabled = true
read-timeout = "30s"

[grpc]
listen = "127.0.0.1:7003"
enabled = true

[tags]
enabled = false
tagdb-url = "http://127.0.0.1:8000"
tagdb-chunk-size = 32
tagdb-update-interval = 100
local-dir = "/var/lib/graphite/tagging/"
tagdb-timeout = "1s"

[carbonserver]
listen = ":8080"
enabled = true
buckets = 10
metrics-as-counters = false
read-timeout = "60s"
write-timeout = "60s"
query-cache-enabled = false
query-cache-size-mb = 0
find-cache-enabled = false
trigram-index = true
scan-frequency = "5m0s"
trie-index = false
max-globs = 100
fail-on-max-globs = false
max-metrics-globbed = 30000
max-metrics-rendered = 1000
graphite-web-10-strict-mode = true
internal-stats-dir = ""
stats-percentiles = [99, 98, 95, 75, 50]

[dump]
enabled = false
path = "/var/lib/graphite/dump/"
restore-per-second = 0

[pprof]
listen = "localhost:7007"
enabled = false

[[logging]]
logger = ""
file = "/var/log/go-carbon/go-carbon.log"
level = "info"
encoding = "mixed"
encoding-time = "iso8601"
encoding-duration = "seconds"
```
storage-aggregation.conf:

```ini
[test]
pattern = \.test$
xFilesFactor = 0
aggregationMethod = average

[default]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average
```
storage-schemas.conf:

```ini
[test]
pattern = ^test
retentions = 1m:2d

[default]
pattern = .*
retentions = 60s:30d,1h:5y
```
Thanks! The key thing here is that you are using compressed whisper, so @bom-d-van would probably be the best person to have a look at your issue.
@Civil yes, without compression this strange case (with repeated points) works fine :)
@chernomor thanks for the report! (also thanks @deniszh and @Civil for the discussion) Will take a look at the issue soon.
@chernomor the issue is somewhat of an edge case in compressed whisper. In the current design, compressed whisper doesn't support data rewrites the way a standard whisper file does. The check we have in the code at the moment only guards files with at least two levels of retention policy (like 1s:2d,1m:30d).
For a single-level retention policy (like 1m:2d), it happily saves the data, but I think it might have some issues in the read path. I will make a fix to address it. We have two solutions for this case:
I haven't decided which way to go. Let me know if you think one is better than the other.
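To make the single-retention case concrete, here is a minimal sketch (mine, not from this thread) of the write pattern being discussed, using what I understand to be the go-whisper API; the file path, values, and PointsPerBlock are made up:

```go
package main

import (
	"fmt"
	"time"

	whisper "github.com/go-graphite/go-whisper"
)

func main() {
	// Single retention level, matching the reporter's [test] schema;
	// with only one archive the two-level rewrite guard never applies.
	retentions, err := whisper.ParseRetentionDefs("1m:2d")
	if err != nil {
		panic(err)
	}

	// Create a compressed whisper (cwhisper) file.
	wsp, err := whisper.CreateWithOptions(
		"/tmp/repro.wsp", retentions, whisper.Average, 0,
		&whisper.Options{Compressed: true, PointsPerBlock: 7200},
	)
	if err != nil {
		panic(err)
	}
	defer wsp.Close()

	now := int(time.Now().Unix()) / 60 * 60 // align to the 1m step

	// Write a point, then resend the same timestamp with a new value:
	// a rewrite that plain whisper handles in place, but that the
	// compressed format does not currently support.
	wsp.UpdateMany([]*whisper.TimeSeriesPoint{{Time: now - 120, Value: 1}})
	wsp.UpdateMany([]*whisper.TimeSeriesPoint{{Time: now - 120, Value: 2}})

	// Reading back may show the duplicated points described above.
	ts, err := wsp.Fetch(now-3600, now)
	if err != nil {
		panic(err)
	}
	fmt.Println(ts.Points())
}
```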
@bom-d-van, with the second solution, will cwhisper save only points with a larger timestamp than the last saved one? And with the first solution it's not guaranteed that all points will be saved over the whole retention period, is that true?
@bom-d-van I think discarding data is the most reliable way, because old points will not be lost (that's the most important goal, in my opinion). A rewrite can be done manually by sending the needed data as a new metric and replacing the old file with the new one.
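For illustration, a rough sketch of that manual workaround, assuming go-whisper's Open/Fetch calls and the plaintext protocol on localhost:2003; the paths and the new metric name are hypothetical:

```go
package main

import (
	"fmt"
	"math"
	"net"
	"time"

	whisper "github.com/go-graphite/go-whisper"
)

func main() {
	// Read the points out of the old (affected) file.
	old, err := whisper.Open("/var/lib/graphite/whisper/test/metric.wsp")
	if err != nil {
		panic(err)
	}
	defer old.Close()

	now := int(time.Now().Unix())
	ts, err := old.Fetch(now-2*24*3600, now) // the whole 1m:2d window
	if err != nil {
		panic(err)
	}

	// Resend the points under a new metric name; once go-carbon has
	// written the new file, the old one can be replaced by it on disk.
	conn, err := net.Dial("tcp", "localhost:2003")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	for _, p := range ts.Points() {
		if math.IsNaN(p.Value) {
			continue // skip empty slots in the fetched window
		}
		fmt.Fprintf(conn, "test.metric-rewritten %v %d\n", p.Value, p.Time)
	}
}
```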
TL;DR: when compression is used, this script ends up filling the database with only one point:
I expect that repeatedly sent points will either be ignored or will update the existing points in the database.
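The script itself is not reproduced above; here is a minimal sketch of the idea (mine, not the reporter's), assuming go-carbon's TCP plaintext listener on localhost:2003 and a hypothetical metric name:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:2003")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Timestamps starting 10 hours ago, aligned to the 1m step.
	start := time.Now().Add(-10*time.Hour).Unix() / 60 * 60

	// Send the same one-interval period several times over; the plaintext
	// protocol is simply "metric value timestamp\n" per data point.
	for round := 0; round < 3; round++ {
		for i := int64(0); i < 10; i++ {
			fmt.Fprintf(conn, "test.cwsp-repro 42 %d\n", start+i*60)
		}
	}
}
```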
How to reproduce.
storage-schemas.conf:
Fill in some points from yesterday:
Send cyclic periods within one interval, 10 hours ago:
At the same time we can see the old points being overwritten:
Points with the same timestamp are duplicated in the database: