cloudera-labs / hms-mirror

"hms-mirror" is a utility used to bridge the gap between two clusters and migrate hive metadata.
Apache License 2.0

Issue while copying Iceberg table from LEFT to RIGHT cluster #34

Open hpasumarthi opened 1 year ago

hpasumarthi commented 1 year ago

Hello Team, we have been seeing a strange issue while running hms-mirror to copy Iceberg tables from the LEFT environment to the RIGHT environment.

It happens as part of this command:

DROP TABLE IF EXISTS hms_mirror_shadow_hms_iceberg;

Error message

2023-04-17 15:15:00,895 [pool-1-thread-1] INFO Cluster.runTableSql(484):RIGHT:SQL:Loading table from Shadow:FROM hms_mirror_shadow_hms_iceberg INSERT OVERWRITE TABLE hms_iceberg SELECT *
2023-04-17 15:15:01,670 [pool-1-thread-1] ERROR Cluster.runTableSql(491):java.sql.SQLException: Error while compiling statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. data/warehouse/tablespace/external/hive/hms_test.db/_tmp.hms_iceberg: PUT 0-byte object  on data/warehouse/tablespace/external/hive/hms_test.db/_tmp.hms_iceberg: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: CDDSXDBDYX3P5ERK; S3 Extended Request ID: LrT3JZopIna3Hrd+VPymRs1FapcdZ63PzYhGmsBmkw5WBVvgSBQfjkTmcMDDjz4W41Wh6rw24WGMlRAurNf/SA==; Proxy: null), S3 Extended Request ID: LrT3JZopIna3Hrd+VPymRs1FapcdZ63PzYhGmsBmkw5WBVvgSBQfjkTmcMDDjz4W41Wh6rw24WGMlRAurNf/SA==:AccessDenied
2023-04-17 15:15:01,727 [pool-1-thread-1] INFO Transfer.call(233):Migration complete for hms_test.hms_iceberg in 5225ms

We have granted users from the LEFT environment only read access to the RIGHT environment's buckets. It looks like the drop command is trying to write to the _tmp.hms_iceberg file, which is not possible with read-only access. Can we check whether this write can be avoided when running the DROP TABLE command?

Hemanth

dstreev commented 1 year ago

@hpasumarthi can you add some context on how you're calling hms-mirror (the command line, etc.)? The RIGHT needs to be able to ensure the shadow tables are 'reset', so the RIGHT user will need access to the LEFT storage in this case to clean up these artifacts. If they aren't allowed that type of access, you could try an intermediate-storage location. That way you wouldn't have to be concerned about RIGHT to LEFT access.
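
When triaging errors like the one in the log above, it can help to pull out which S3 operation was denied and on which object before deciding which side needs which permission. Below is a minimal sketch in Python, not part of hms-mirror. The function name parse_s3_denial and the regular expressions are assumptions based only on the single log line posted in this thread, and the sample line is abridged (request IDs omitted). Other S3A error messages may be worded differently.

```python
import re

# Sample line abridged from the log in this thread (request IDs omitted).
SAMPLE = (
    "2023-04-17 15:15:01,670 [pool-1-thread-1] ERROR Cluster.runTableSql(491):"
    "java.sql.SQLException: Error while compiling statement: FAILED: Execution Error, "
    "return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask. "
    "data/warehouse/tablespace/external/hive/hms_test.db/_tmp.hms_iceberg: "
    "PUT 0-byte object  on data/warehouse/tablespace/external/hive/hms_test.db/_tmp.hms_iceberg: "
    "com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied "
    "(Service: Amazon S3; Status Code: 403; Error Code: AccessDenied)"
)


def parse_s3_denial(line):
    """Hypothetical helper: extract the denied S3 operation, object key,
    HTTP status and error code from one log line, or return None."""
    # Pattern assumed from the posted log: "<VERB> ... on <key>: <exception>"
    op = re.search(r"\b(PUT|GET|DELETE|HEAD)\b.*?\bon\s+(\S+?):\s", line)
    status = re.search(r"Status Code:\s*(\d+)", line)
    code = re.search(r"Error Code:\s*(\w+)", line)
    if not (op and status and code):
        return None
    return {
        "operation": op.group(1),
        "key": op.group(2),
        "status": int(status.group(1)),
        "error_code": code.group(1),
    }


if __name__ == "__main__":
    print(parse_s3_denial(SAMPLE))
```

For the line posted here, the denied request is a PUT of a 0-byte object on the _tmp.hms_iceberg path, so the account running the statement needs write access to that location, or the transfer has to be routed through storage where it does.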