go-graphite / carbon-clickhouse

Graphite metrics receiver with ClickHouse as storage
MIT License
186 stars 47 forks source link

Tag table zero-timestamp #150

Open Hipska opened 1 week ago

Hipska commented 1 week ago

Is there any reason the config option for upload zero-timestamp is not honoured for type = "tagged"? IMHO, the storage could be optimised if it would just store value 0, I don't think I use the Version column in my setup?

   ┌─table───────────────────┬─column──┬─type──────────┬───────rows─┬─disk───────┬─avg_size─┬─compressed─┬─uncompressed─┬─────compress_ratio─┐
1. │ default.graphite_tagged │ Date    │ Date          │ 8480650637 │ 357.97 MiB │ 0.04 B   │ 344.39 MiB │ 15.80 GiB    │   46.9677337727716 │
2. │ default.graphite_tagged │ Tag1    │ String        │ 8480650637 │ 1.17 GiB   │ 0.15 B   │ 1.16 GiB   │ 174.72 GiB   │ 151.10332522990007 │
3. │ default.graphite_tagged │ Path    │ String        │ 8480650637 │ 23.64 GiB  │ 2.99 B   │ 23.63 GiB  │ 1.43 TiB     │ 62.049363313301114 │
4. │ default.graphite_tagged │ Tags    │ Array(String) │ 8480650637 │ 49.53 GiB  │ 6.27 B   │ 49.50 GiB  │ 1.51 TiB     │ 31.323010902004516 │
5. │ default.graphite_tagged │ Version │ UInt32        │ 8480650637 │ 2.62 GiB   │ 0.33 B   │ 2.61 GiB   │ 31.59 GiB    │ 12.118893459540326 │
   └─────────────────────────┴─────────┴───────────────┴────────────┴────────────┴──────────┴────────────┴──────────────┴────────────────────┘
Felixoid commented 1 week ago

I think nobody wanted to have all possible tags for all the time; that's why the index table feature zero-timestamp is not supported in the graphite_tagged.

The Version column should be used in ENGINE = ReplacingMergeTree(Version) as the latest version in the partition for (Tag1, Path, Date) combination.

Hipska commented 1 week ago

Right, thanks for pointing out. So what if one would use ReplacingMergeTree(Date) instead?

Felixoid commented 1 week ago

The most straightforward shot I suggest is using CODEC(DoubleDelta, ZSTD) as Version's codec. You will be surprised by how much better it is. Another thing that could be tested is the addition of TTL. If you'll succeed with it, updating the documentation would be a significant improvement!

ReplacingMergeTree(Date) will be unstable. I don't know how CH decides which row to keep on the same values.

The implementation details of the tagged table are tricky for me. So, getting rid of the Version should be in sync with graphite-clickhouse and carbon-clickhouse.

Hipska commented 1 week ago

ReplacingMergeTree(Date) will be unstable. I don't know how CH decides which row to keep on the same values.

Does it matter, since they're the same values?

The implementation details of the tagged table are tricky for me. So, getting rid of the Version should be in sync with graphite-clickhouse and carbon-clickhouse.

It's not getting rid of it, just setting the value always to 0 (given that you adjust the table engine as well). I only see one use of that column in graphite-clickhouse, and that seems to be for its own tags implementation, not regular graphite tags..

Felixoid commented 1 week ago

In graphite-clickhouse it's post-aggregation for the ReplacingMergeTree, exactly the same purpose.

By adding the TTL and CODEC to default.tagging_graphite you'll have the same as Version=0 without changing the code.