apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Hive's performance for querying the Iceberg table is very poor. #8901

Open BsoBird opened 11 months ago

BsoBird commented 11 months ago

Query engine

Hive 3.1.3, Iceberg 1.3.1, Spark 3.3.2

Question

Hi team. I have been testing Hive queries against Iceberg tables and found the performance to be very poor, to the point of being almost unusable in a production environment. In addition, join conditions are not pushed down to the Iceberg partitions. I am using Hive 3.1.3 with the Iceberg 1.3.1 Hive runtime JAR from the Iceberg community. This is very frustrating, because the performance is too poor for me to deliver to my customers. How can I solve this problem? Details: https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1695050248606629 I would be grateful for any guidance.
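To make the join pushdown complaint concrete: a literal filter on a partition column can be evaluated at planning time, whereas a key that only arrives through a join is not known until the query runs. One possible workaround, which is an assumption on my part and not something proposed in this thread, is to resolve the join keys first and inline them as an `IN` list. The sketch below only builds the query text; the function name, the table `db.events`, and the column `dt` are hypothetical.

```python
import re

# Hypothetical helper: inline already-resolved partition keys as literals so the
# filter is a plain predicate instead of a join condition.
_SAFE_KEY = re.compile(r"^[A-Za-z0-9_\-]+$")


def build_pruned_query(table, partition_col, keys):
    """Return a SELECT whose partition filter is an IN list of literals."""
    if not keys:
        raise ValueError("keys must not be empty")
    unique_keys = sorted(set(str(k) for k in keys))
    for key in unique_keys:
        # Reject anything that would need escaping rather than guess at quoting rules.
        if not _SAFE_KEY.match(key):
            raise ValueError(f"unsupported characters in key: {key!r}")
    literals = ", ".join(f"'{key}'" for key in unique_keys)
    return f"SELECT * FROM {table} WHERE {partition_col} IN ({literals})"


if __name__ == "__main__":
    print(build_pruned_query("db.events", "dt", ["2023-09-02", "2023-09-01"]))
```

Whether this actually prunes partitions depends on the engine and version, so it is worth checking the query plan before relying on it.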

lurnagao-dahua commented 11 months ago

Hi. In Hive 3, DML operations work only with the MapReduce execution engine. The poor performance is likely due to the MR execution engine.

pvary commented 11 months ago

You might want to try Hive 4.0.0-beta-1, which has plenty of related performance improvements.

pvary commented 11 months ago

Also, Iceberg is included out of the box.

github-actions[bot] commented 1 week ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.