snowplow-incubator / snowplow-lake-loader

Snowplow Lake Loader

Hudi loader should fail early if missing permissions on Glue catalog #72

Closed istreeter closed 3 months ago

istreeter commented 4 months ago

It is possible to run the Hudi Lake Loader with the Hudi option "hoodie.datasource.hive_sync.enable": "true" set, which registers and syncs the table to a Hive Metastore or Glue.
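For illustration, the sync-related Hudi options might look like the sketch below. Only "hoodie.datasource.hive_sync.enable" comes from this issue; the other keys are standard Hudi sync options, and the database and table values are placeholders, not the loader's actual settings.

```scala
// A minimal sketch of Hudi write options that turn on Hive/Glue sync.
// Assumption: the database and table names are placeholders for illustration.
object HiveSyncOptions {
  val hiveSync: Map[String, String] = Map(
    // The option named in this issue: sync the table to the metastore after a commit.
    "hoodie.datasource.hive_sync.enable" -> "true",
    // Talk to the metastore API directly rather than over JDBC.
    "hoodie.datasource.hive_sync.mode" -> "hms",
    // Placeholder catalog database and table names.
    "hoodie.datasource.hive_sync.database" -> "snowplow",
    "hoodie.datasource.hive_sync.table" -> "events"
  )
}
```

These options only take effect when Hudi completes a commit, which is why a missing permission can go unnoticed until the first events arrive.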

However, with that setting enabled, Hudi delays syncing until the first time events are committed. For our use case, it is more helpful if the loader connects to Glue/Hive during startup, so we get an alert sooner if the loader is missing permissions.

This PR works by making the loader add an empty commit during startup. The empty commit does not add any Parquet file, but it triggers the loader to sync the table to Glue/Hive.
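One way to picture an empty commit at startup is the rough sketch below, which writes a zero-row DataFrame through the Hudi Spark datasource. This is not necessarily how the PR implements it. The object name, schema, field names, and table name are hypothetical, and it assumes Hudi accepts the empty write as a commit (the "hoodie.allow.empty.commit" behaviour) and that the Spark and Hudi libraries are on the classpath.

```scala
import org.apache.spark.sql.{Row, SaveMode, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType, TimestampType}

// Hypothetical helper, for illustration only.
object EmptyCommitSketch {

  // Hypothetical minimal schema; the real events table has many more columns.
  val schema: StructType = StructType(Seq(
    StructField("event_id", StringType, nullable = false),
    StructField("load_tstamp", TimestampType, nullable = false)
  ))

  // Hudi options for the write; table and field names are assumptions.
  val options: Map[String, String] = Map(
    "hoodie.table.name" -> "events",
    "hoodie.datasource.write.operation" -> "insert",
    "hoodie.datasource.write.recordkey.field" -> "event_id",
    "hoodie.datasource.write.precombine.field" -> "load_tstamp",
    "hoodie.datasource.hive_sync.enable" -> "true"
  )

  // Write zero rows so that Hudi commits and then runs the Hive/Glue sync.
  // If the loader lacks catalog permissions, the exception surfaces here,
  // at startup, instead of when the first batch of events is committed.
  def syncAtStartup(spark: SparkSession, tablePath: String): Unit = {
    val empty = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
    empty.write
      .format("hudi")
      .options(options)
      .mode(SaveMode.Append)
      .save(tablePath)
  }
}
```

Calling `EmptyCommitSketch.syncAtStartup(spark, tablePath)` before the loader starts consuming events would let a permissions error stop the application immediately.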