Open TheR1sing3un opened 3 weeks ago
The only downside is users partitioning too granularly, which could bombard downstream metrics systems.
I see how it's useful, though.
for example, p99 latency of compaction operation for specified partition
Is it feasible to extend the compaction metrics a little, perhaps by representing the latency metrics at another level, aggregated by partition?
The only downside is users partitioning too granularly, which could bombard downstream metrics systems.
Yes, we also need to consider the case of too many partitions. I think we can provide this capability and leave it to each user to decide whether to turn it on.
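To make the opt-in idea concrete, here is a minimal sketch of how partition-level metrics could be gated. It is plain Python for illustration only, not Hudi code: the class `PartitionMetricsGate`, the `enabled` switch, the `max_partitions` cap, and the metric names are all hypothetical. The cap is one possible guard against overly granular partitioning; partitions beyond it share a single overflow series.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class PartitionMetricsGate:
    """Decides which metric series (if any) a partition reports to.

    All names here are hypothetical; this is not an existing Hudi API.
    """
    enabled: bool = False       # opt-in switch, off by default
    max_partitions: int = 100   # cap on distinct per-partition series
    _seen: Set[str] = field(default_factory=set)

    def metric_key(self, partition_path: str) -> Optional[str]:
        # Feature disabled: emit no partition-level metric at all.
        if not self.enabled:
            return None
        if partition_path not in self._seen:
            # Cap reached: fold new partitions into one overflow series
            # so the number of series stays bounded.
            if len(self._seen) >= self.max_partitions:
                return "compaction.partition._other_.duration"
            self._seen.add(partition_path)
        return f"compaction.partition.{partition_path}.duration"
```

With the switch off, nothing changes for existing users; with it on, the number of emitted series is bounded by `max_partitions + 1`.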
for example, p99 latency of compaction operation for specified partition
Is it feasible to extend the compaction metrics a little, perhaps by representing the latency metrics at another level, aggregated by partition?
I plan to provide a Histogram aggregated by partition. It will record compaction stats such as:
Can we provide partition-level metrics? In many scenarios where partitions are used, such as p_date and p_product, which split data by time or type, the data can differ considerably between partitions. Can we provide metrics with a partition dimension, for example, the p99 latency of the compaction operation for a specified partition? This would help a lot with performance optimization.
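As a rough illustration of the request, the sketch below keeps one latency histogram per partition and reads a p99 from it. It is plain Python, not Hudi code: `PartitionLatencyHistogram` and its methods are hypothetical names, and it stores every sample for simplicity, whereas a production histogram would normally use a bounded reservoir.

```python
import math
from collections import defaultdict


class PartitionLatencyHistogram:
    """Hypothetical per-partition histogram of compaction durations."""

    def __init__(self):
        # partition path -> list of observed durations in milliseconds
        self._samples = defaultdict(list)

    def update(self, partition_path, duration_ms):
        """Record one compaction duration for the given partition."""
        self._samples[partition_path].append(duration_ms)

    def percentile(self, partition_path, q):
        """Nearest-rank percentile for one partition, q in (0, 100].

        Returns None if nothing was recorded for the partition.
        """
        samples = sorted(self._samples.get(partition_path, []))
        if not samples:
            return None
        # Nearest-rank: smallest value with at least q% of samples <= it.
        rank = max(1, math.ceil(q * len(samples) / 100))
        return samples[rank - 1]
```

Usage, assuming durations are reported once per compacted partition:

```python
hist = PartitionLatencyHistogram()
for ms in (120, 340, 95):
    hist.update("p_date=2024-01-01", ms)
p99 = hist.percentile("p_date=2024-01-01", 99)
```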