Open bkosuru opened 2 years ago
Insert: 0.8.0 - 5.7 min | 0.10.1 - 12 min | 0.11.1 - 5.7 min
Upsert: 0.8.0 - 27 min | 0.10.1 - 66 min | 0.11.1 - 42.5 min
Using
.option("hoodie.metadata.index.bloom.filter.enable", "true")
.option("hoodie.metadata.index.column.stats.enable", "true")
.option("hoodie.index.type", "BLOOM")
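To put the timings above on a common scale, here is a minimal sketch that expresses each version's runtime as a multiple of the 0.8.0 runtime. The numbers are copied from the comment; the helper name `slowdown_vs_baseline` is made up for illustration and is not part of Hudi.

```python
# Wall-clock timings in minutes, copied from the comment above.
timings = {
    "insert": {"0.8.0": 5.7, "0.10.1": 12.0, "0.11.1": 5.7},
    "upsert": {"0.8.0": 27.0, "0.10.1": 66.0, "0.11.1": 42.5},
}


def slowdown_vs_baseline(timings, baseline="0.8.0"):
    """Return each version's runtime as a multiple of the baseline runtime."""
    return {
        op: {ver: round(mins / by_ver[baseline], 2) for ver, mins in by_ver.items()}
        for op, by_ver in timings.items()
    }


ratios = slowdown_vs_baseline(timings)
for op, by_ver in ratios.items():
    print(op, by_ver)
# insert {'0.8.0': 1.0, '0.10.1': 2.11, '0.11.1': 1.0}
# upsert {'0.8.0': 1.0, '0.10.1': 2.44, '0.11.1': 1.57}
```

Read this way, insert in 0.11.1 is back at the 0.8.0 baseline, while upsert in 0.11.1 still takes roughly 1.57x as long as in 0.8.0.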
@bkosuru for 0.11.1, could you turn off column stats and bloom filter in metadata table and see if that helps bring the write latency on par?
@yihua I will test when I get a chance. But since insert is performing well in 0.11.1, we will upgrade to 0.11.1; the upsert use case is rare for us. You can change the priority to minor. Thanks
cc @nsivabalan, I have the impression that we have fixed this performance regression. Do you remember which patch?
seems to be related to #4012, which is fixed in 0.11.0
Yes, we have made quite a few fixes around perf in 0.12. Can you wait a couple of days and give 0.12 a try? Highly recommended if you are looking for better performance.
@bkosuru: we have made a lot of fixes around perf in 0.12 on both the read and write side. Can you try 0.12 and let us know what you see? Please disable bloom filter and column stats. Try with and without enabling metadata as well. Curious to know how this fares.
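As a rough sketch of the two runs suggested here (bloom filter and column stats indexes disabled, once with the metadata table on and once with it off), the option sets could be built like this. It assumes `hoodie.metadata.enable` is the switch for the metadata table; `hudi_write_options` is a hypothetical helper, and the rest of the writer config from the linked issue is not reproduced.

```python
def hudi_write_options(metadata_enabled):
    """Build the index-related writer options for one benchmark run.

    Hypothetical helper for illustration only; merge the result with the
    rest of your existing writer config.
    """
    return {
        "hoodie.index.type": "BLOOM",
        # Assumed switch for the metadata table itself.
        "hoodie.metadata.enable": str(metadata_enabled).lower(),
        # Both metadata-table indexes turned off, as requested above.
        "hoodie.metadata.index.bloom.filter.enable": "false",
        "hoodie.metadata.index.column.stats.enable": "false",
    }


runs = {
    "metadata_on": hudi_write_options(True),
    "metadata_off": hudi_write_options(False),
}

# Print each run in the same .option(...) form used earlier in the thread.
for name, opts in runs.items():
    print(name)
    for key, value in sorted(opts.items()):
        print(f'  .option("{key}", "{value}")')
```

With PySpark, each dictionary could presumably be passed in one call via `df.write.format("hudi").options(**opts)`, alongside the other writer settings.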
@bkosuru would you mind re-running the benchmark using 0.12.1? We would like to verify whether the perf gaps are resolved.
This will take me a while to setup and test. I will do it when time permits. Thanks!
@bkosuru Did updating to the latest version improve the performance? Do you still need help with this?
Hello,
We are trying to upgrade an HDFS-based table from Hudi 0.8.0 to 0.10.1. We cannot upgrade to 0.11.1 yet. We noticed a big performance hit with Insert/Upsert.
Insert: 0.8.0 - 5.7 min | 0.10.1 - 12 min
Upsert: 0.8.0 - 27 min | 0.10.1 - 66 min
The writer config we use is here - https://github.com/apache/hudi/issues/5741 Is this a known issue? Is there any additional setting we need to use in 0.10.1?
Here are the screenshots for Upsert:
Thanks, Bindu