Open Fiery-Fenix opened 3 months ago
Pinging code owners:
exporter/clickhouse: @hanjm @dmitryax @Frapschen @SpencerTorres
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hey, thanks for writing this. This is a reasonable request overall, few notes:
Using single table for all metrics types, i.e. otel_metrics, which bring better compression and usability - no need to search for required metric across 5 different tables
I believe this would negatively affect querying, depending on the primary key / aggregation, but the schemas can be replaced by the user so that's not too relevant. I'm unsure if these tables would remain compatible long term. If we change the insert SQL for one kind of metric but not the other, then it wouldn't be possible to use the same table. This can be addressed later however.
Add configuration parameters with metrics tables suffix override. . . . Another option might be to deprecate current metrics_table_name configuration parameter and introduce 5 new instead:
If this were to be added, it would be nice to see them under one field:
metrics_table_suffix:
gauge: "_gauge"
sum: "_counter"
summary: "_summary"
histogram: "_histogram"
exponential_histogram: "_exponential_histogram"
Between the two solutions you suggested, it would be best to use the full table name and not just the suffix. It has the most flexibility.
As a temporary workaround, you could always use a materialized view to forward _sum
to _counter
.
Thanks for the suggestion! Always happy to improve flexibility. Interested in seeing what others think
Hello! I would like to take this ticket if possible
I believe this would negatively affect querying, depending on the primary key / aggregation, but the schemas can be replaced by the user so that's not too relevant. I'm unsure if these tables would remain compatible long term. If we change the insert SQL for one kind of metric but not the other, then it wouldn't be possible to use the same table. This can be addressed later however.
As for performance of single metrics table - on my small set of ~6Bil rows I haven't found statistically significant performance degradation comparing to current, multi-table approach. I know that ClickHouse is very well optimized for smaller amount of tables with bigger size. That's why I was trying to test my sinlge-table approach on production-like environment with bigger amount of data. That's how this issue was born ;) Related to compatibility with insert SQL queries - as long as single table will have all fields from other 5 tables - there will be 0 issues with this approach. ClickHouse inserts default value (or Null if column is Nullable) in case of absence of this column in insert SQL. I've already tested this approach with custom built OpenTelemetry Collector (just renamed all metrics tables in code) - and it works as expected
Between the two solutions you suggested, it would be best to use the full table name and not just the suffix. It has the most flexibility.
Agree, that's looks as a best solution and also aligned with other tables names overrides (like logs and traces)
@odubajDT assigned
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers
. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Not stale, waiting on https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/34251
Component(s)
exporter/clickhouse
Is your feature request related to a problem? Please describe.
Right now table names for metrics are semi-hardcoded, i.e. it's using configurable prefix from
metrics_table_name
parameter and hardcoded suffix for each metric type, like_gauge
/_sum
/_summary
/etc. There is no possibility to change this suffixes outside of the code, but this might be useful for such use-cases:otel_metrics
, which bring better compression and usability - no need to search for required metric across 5 different tablesotel_metrics_sum
table tootel_metrics_counter
which is more comfortable naming for those who is more familiar with Prometheus metrics typesDescribe the solution you'd like
Add configuration parameters with metrics tables suffix override. For example:
metrics_table_gauge_suffix: "_gauge"
metrics_table_sum_suffix: "_sum"
metrics_table_summary_suffix: "_summary"
metrics_table_histogram_suffix: "_histogram"
metrics_table_exponential_histogram_suffix: "_exponential_histogram"
Another option might be to deprecate current
metrics_table_name
configuration parameter and introduce 5 new instead:metrics_table_gauge_name: "otel_metrics_gauge"
metrics_table_sum_name: "otel_metrics_sum"
metrics_table_summary_name: "otel_metrics_summary"
metrics_table_histogram_name: "otel_metrics_histogram"
metrics_table_exponential_histogram_name: "otel_metrics_exponential_histogram"
Describe alternatives you've considered
No response
Additional context
No response