[Open] floriandaniel opened this issue 2 years ago
@alexeykudinkin: can you take a look at this?
Hey, @floriandaniel! Thanks for taking the time to file such a detailed description.
First of all, I believe the crux of the problem likely lies in the use of the Bloom Index backed by the Metadata table: we've recently identified a performance gap there, and @yihua is currently working on addressing it (https://github.com/apache/hudi/pull/6432 is already in progress).
Second, I'd recommend the following in your evaluation (a config sketch follows this list):

- Disable `hoodie.bloom.index.use.metadata` for now (until the above fix lands).
- Have you tried `hoodie.bloom.index.prune.by.ranges`? It's a crucial aspect of using the Bloom Index: for update-heavy workloads it prunes the search space considerably, checking only the files that could contain the target records (and eliminating the ones that couldn't).
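Here's a minimal sketch (mine, not from this thread) of how those two options could be set on a PySpark upsert; the record key, table name, and S3 path are hypothetical placeholders:

```python
# Hypothetical PySpark upsert applying the two Bloom Index options above.
# Record key, table name, and path are placeholders, not from the issue.
(updates_df.write.format("hudi")
    .option("hoodie.table.name", "my_table")                           # placeholder
    .option("hoodie.datasource.write.operation", "upsert")
    .option("hoodie.datasource.write.recordkey.field", "record_id")    # placeholder
    .option("hoodie.datasource.write.partitionpath.field", "country_iso")
    .option("hoodie.index.type", "BLOOM")
    # Workaround until the metadata-table Bloom Index fix lands:
    .option("hoodie.bloom.index.use.metadata", "false")
    # Prune candidate files by record-key ranges before probing bloom filters:
    .option("hoodie.bloom.index.prune.by.ranges", "true")
    .mode("append")
    .save("s3://my-bucket/hudi/my_table"))                             # placeholder
```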
Problem
I'm testing Apache Hudi's ability to make upserts faster than our current Spark-based process. Each record contains 40 fields. The partitioning key is country_iso (a string field) with 200 distinct values, and the partitions are quite unbalanced (US and China hold many more records than the rest). The problem is that I'm getting very slow performance even with small datasets (~1 Gb). I'm updating a string field that is neither the partitioning key nor the record key, and the ratio of updates in my upsert dataset is 100%.
This could come from the way my Parquet files are partitioned or from the unbalanced partitioning itself; perhaps I should choose another partitioning key. (A quick skew check is sketched below.)
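One illustrative way (not part of the original report) to quantify that imbalance, assuming the source data is loaded as a DataFrame named `df`:

```python
# Hypothetical skew check; `df` stands in for the source DataFrame.
from pyspark.sql import functions as F

(df.groupBy("country_iso")
   .agg(F.count("*").alias("records"))
   .orderBy(F.desc("records"))
   .show(10))  # the heaviest partitions (e.g., US, China) should top the list
```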
Environment Description
- Hudi version : 0.11.1
- Spark version : 3.1.2-amzn-1
- Hive version :
- Hadoop version : 3.2.1 (Amazon)
- Storage (HDFS/S3/GCS..) : S3
- Running on Docker? (yes/no) : no
- AWS EMR : emr-6.5.0, 1 master node (r5.xlarge), 2 core nodes (r5d.2xlarge)
Additional context
Hudi Config
[Benchmark table: only fragments survived extraction. The headers indicated record counts (nb in millions), dataset sizes (size in Gb), and two timing columns (time in mins); the surviving cells pair three input datasets (0.9 Gb, 7.9 Gb, 18.7 Gb) with update payloads (0.05 Gb, 0.55 Gb, 1.1 Gb) respectively.]
The image below shows the partition /BN, which contains very small Parquet files.
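As a side note not raised in the thread: small files produced on upsert are usually governed by Hudi's file-sizing options. The keys below are standard Hudi configs; the values shown are just the documented defaults, included for illustration:

```python
# Standard Hudi file-sizing options (values are the documented defaults).
file_sizing_opts = {
    # Parquet files below this size (bytes) are treated as "small" and are
    # padded with incoming inserts on subsequent writes:
    "hoodie.parquet.small.file.limit": "104857600",  # 100 MB
    # Target upper bound (bytes) for files Hudi writes:
    "hoodie.parquet.max.file.size": "125829120",     # ~120 MB
}
```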
Here is the Spark trace of an upsert with the Bloom index (sample_10):

IMAGE 1. Building workload profile: BLOOM_hudi_sample_10 (duration: 13 min)
IMAGE 2. Doing partition and writing data: BLOOM_hudi_sample_10 (duration: ~8 min)