Description

I've enabled Elementary for my dbt projects. In the biggest one, every dbt model runs as a separate Airflow task, and I ran into a problem at the on-run-end hook level: when Elementary tries to insert new data during parallel model runs, I get the following error:

```
[2024-05-20, 11:37:52 UTC] {pod_manager.py:484} INFO - [base] {'status': 'error', 'message': "on-run-end failed, error:\n ICEBERG_COMMIT_ERROR: Failed to commit Iceberg update to the table: . If a data manifest file was generated at 's3://.../results/dbt/f4ea24c3-29ec-43a7-bc46-a1fca4ece38c-manifest.csv', you may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account.", 'thread': 'main', 'execution_time': 0, 'num_failures': 1, 'timing_info': [], 'adapter_response': {}}
```

The root cause is that Iceberg does not support concurrent inserts. To fix the problem, I added a query retry on this error, using `wait_random_exponential` instead of `wait_exponential` so that colliding parallel tasks back off for different amounts of time.

Checklist
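`wait_random_exponential` and `wait_exponential` refer to wait strategies from the Python tenacity library. As a minimal, dependency-free sketch of why the random variant helps here (all names below are illustrative, not Elementary's actual code): with a deterministic exponential backoff, two writers that conflict on an Iceberg commit would both retry at the same instants and conflict again, whereas a random exponential backoff makes them sleep different durations.

```python
import random
import time

class IcebergCommitError(Exception):
    """Stand-in for the ICEBERG_COMMIT_ERROR raised on concurrent Iceberg commits (hypothetical)."""

def retry_with_random_backoff(fn, attempts=5, multiplier=0.01, max_wait=0.1):
    """Retry fn on IcebergCommitError, sleeping a random exponential interval.

    random.uniform(0, bound) mirrors tenacity's wait_random_exponential: each
    retry sleeps a random time drawn from an exponentially growing window, so
    two parallel writers that collided once are unlikely to collide again.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except IcebergCommitError:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            bound = min(max_wait, multiplier * (2 ** attempt))
            time.sleep(random.uniform(0, bound))

calls = {"n": 0}

def insert_elementary_results():
    # Hypothetical insert that hits a commit conflict twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise IcebergCommitError("ICEBERG_COMMIT_ERROR: Failed to commit Iceberg update")
    return "ok"

print(retry_with_random_backoff(insert_elementary_results))  # → ok
```

In tenacity terms, the equivalent change is swapping `wait=wait_exponential(...)` for `wait=wait_random_exponential(...)` in the `@retry` decorator that guards the insert query.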