Alluxio / alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud
https://www.alluxio.io
Apache License 2.0

prestodb query alluxio hudi partitioned table error #17659

Open qingyuan18 opened 1 year ago

qingyuan18 commented 1 year ago

Alluxio Version: 2.7.3

Describe the bug
When a partitioned Hudi table is backed by Alluxio, querying it with PrestoDB fails with the error: Query 20230620_011155_00002_kx3sk failed: Partition path does not belong to base-path

To Reproduce
1. Write a partitioned Hudi table to the Alluxio UFS path (e.g. the Alluxio UFS path is s3://salunchbucket/data/alluxio/).
2. Set the Hudi table location to the Alluxio filesystem path (e.g. alter table tpcds_text_1000.dwd_charge_transaction_record_v_partition set location "alluxio:///1000/dwd_charge_transaction_record_v_partition").
3. Load the Hudi data into Alluxio (e.g. alluxio fs load /1000/).
4. Query the Hudi table with PrestoDB (e.g. select count(1) from tpcds_text_1000.dwd_charge_transaction_record_v_partition).
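To make the error message concrete, here is a minimal Python sketch of the kind of prefix check that could produce it. It is not taken from the Hudi or PrestoDB source: the function name `relative_partition_path`, the choice to compare only the path component (ignoring scheme and bucket), and the partition name `dt=2023-06-20` are all assumptions for illustration. Whether a partition location that still points at the S3 path is the actual cause here has not been confirmed in this thread.

```python
from urllib.parse import urlparse

ERROR = "Partition path does not belong to base-path"


def relative_partition_path(base_path: str, partition_path: str) -> str:
    """Hypothetical check: return the partition path relative to the table base path.

    Assumption: only the path component is compared; scheme and
    authority (e.g. the S3 bucket) are ignored.
    """
    base = urlparse(base_path).path.rstrip("/")
    full = urlparse(partition_path).path.rstrip("/")
    # The partition must be the base directory itself or live underneath it.
    if full != base and not full.startswith(base + "/"):
        raise ValueError(ERROR)
    return full[len(base):].lstrip("/")


table = "dwd_charge_transaction_record_v_partition"
base = f"alluxio:///1000/{table}"

# A partition registered under the Alluxio location passes the check.
ok = relative_partition_path(base, f"alluxio:///1000/{table}/dt=2023-06-20")

# A partition still registered under the UFS location does not.
try:
    relative_partition_path(
        base, f"s3://salunchbucket/data/alluxio/1000/{table}/dt=2023-06-20"
    )
    failed = False
except ValueError:
    failed = True
```

If the real check behaves like this sketch, comparing the table location with the partition locations recorded in the metastore would be a reasonable first thing to inspect.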

Expected behavior
The query returns a result, as it does when Hive queries the same Alluxio-cached Hudi table, which works fine.

LuQQiu commented 1 year ago

@JiamingMai have you seen this error before? Querying the Hudi table with PrestoDB throws: Query 20230620_011155_00002_kx3sk failed: Partition path does not belong to base-path

@qingyuan18 can you add the Alluxio logs here? They are under ${ALLUXIO_HOME}/logs and can help us better understand where the error comes from.
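As a rough way to pull the relevant lines out of that directory before attaching them, the following Python sketch scans log files for a search string such as the table name. The helper name `find_log_lines` and the assumption that the files of interest end in `.log` are illustrative, not something stated in this thread.

```python
from pathlib import Path


def find_log_lines(log_dir, needle):
    """Return (file name, line number, line) for every line containing needle.

    Assumption: the logs of interest are the *.log files directly
    under log_dir (e.g. ${ALLUXIO_HOME}/logs).
    """
    hits = []
    for path in sorted(Path(log_dir).glob("*.log")):
        # errors="replace" keeps the scan going if a file has odd bytes.
        with open(path, encoding="utf-8", errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                if needle in line:
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits
```

For example, `find_log_lines(os.path.join(os.environ["ALLUXIO_HOME"], "logs"), "dwd_charge_transaction_record_v_partition")` (after `import os`) would list every logged line that mentions the table from the reproduction steps.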