apache / iceberg-python

Apache PyIceberg
https://py.iceberg.apache.org/
Apache License 2.0

Support commit retries #269

Open Fokko opened 9 months ago

Fokko commented 9 months ago

Feature Request / Improvement

Within Iceberg, when a commit fails because of a concurrent operation, we can retry it by loading the latest version of the table metadata and re-applying the operation on top of the current snapshot.

nicor88 commented 9 months ago

A few suggestions on this feature. It would be good to have control over the number of retries and the retry strategy. After trying out a few retry libraries, I found tenacity to be one of the most complete, because it offers several options for stopping and waiting between retries.
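
To make the two knobs mentioned above concrete, here is a minimal standard-library sketch of a retry policy with a configurable attempt count and a pluggable wait strategy. It is not tenacity's API and not anything PyIceberg ships; `RetryPolicy`, `fixed_wait`, and `exponential_wait` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


def fixed_wait(seconds: float) -> Callable[[int], float]:
    """Wait strategy: the same pause before every retry."""
    return lambda attempt: seconds


def exponential_wait(base: float, cap: float) -> Callable[[int], float]:
    """Wait strategy: double the pause on each retry, up to `cap` seconds."""
    return lambda attempt: min(cap, base * 2 ** (attempt - 1))


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy object: how many attempts, and how long to wait."""

    max_attempts: int
    wait: Callable[[int], float]

    def delays(self) -> Iterator[float]:
        """Yield the pause before each retry (max_attempts - 1 retries in total)."""
        for attempt in range(1, self.max_attempts):
            yield self.wait(attempt)


policy = RetryPolicy(max_attempts=4, wait=exponential_wait(base=0.5, cap=2.0))
print(list(policy.delays()))  # [0.5, 1.0, 2.0]
```

Keeping the wait strategy as a plain callable means a caller could swap in a fixed delay, exponential backoff, or a jittered variant without touching the commit code.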

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] commented 3 months ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'

sungwy commented 2 months ago

mark not stale

kevinjqliu commented 4 weeks ago

As a workaround, to manually retry commits, update the table metadata by using

table = table.refresh()

before calling commit() again

maxlucuta commented 4 weeks ago

I have also experienced being unable to write to tables in highly distributed environments. Refreshing the table in isolation, even with some retry logic added, did not work. The solution we found to work involved:

  1. Refreshing the table.
  2. Creating a new transaction from the refreshed table.
  3. Generating new snapshots for the data files involved in the previous transaction.
  4. Trying to commit again; if that fails, going back to step 1.
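
The steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the code used in the thread: `commit_with_refresh`, `ConflictError`, and `FakeTable` are hypothetical stand-ins so the example runs without a catalog. With PyIceberg, the callback would presumably open a new transaction via `table.transaction()`, re-add the same data files, and commit, and the exception to catch would presumably be `pyiceberg.exceptions.CommitFailedException`.

```python
from typing import Callable, Type


def commit_with_refresh(
    table,
    stage_and_commit: Callable[[object], None],
    conflict_error: Type[Exception],
    max_attempts: int = 5,
) -> int:
    """Retry stage_and_commit on a freshly refreshed table until it succeeds.

    Returns the number of attempts used; re-raises the conflict after the last one.
    """
    for attempt in range(1, max_attempts + 1):
        table.refresh()  # step 1: pick up the latest committed metadata
        try:
            # Steps 2-4: the callback builds a new transaction on the refreshed
            # table, re-stages the same data files, and tries to commit.
            stage_and_commit(table)
            return attempt
        except conflict_error:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


class ConflictError(Exception):
    """Stand-in for the catalog's commit-conflict exception."""


class FakeTable:
    """In-memory stand-in: commits conflict while other writers are ahead."""

    def __init__(self, concurrent_commits: int) -> None:
        self.remaining_conflicts = concurrent_commits
        self.refreshes = 0

    def refresh(self) -> "FakeTable":
        self.refreshes += 1
        return self


def stage_and_commit(table: FakeTable) -> None:
    """Stand-in for 'new transaction, new snapshot, commit'."""
    if table.remaining_conflicts > 0:
        table.remaining_conflicts -= 1
        raise ConflictError("another writer committed first")


table = FakeTable(concurrent_commits=2)
print(commit_with_refresh(table, stage_and_commit, ConflictError))  # 3
```

The point of passing the work as a callback is that every attempt rebuilds the transaction from the refreshed table, rather than re-submitting a transaction that was staged against stale metadata.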

mark-major commented 3 weeks ago

@maxlucuta Yes, that's what I have been using. It would be nice if there were an internal retry for the commit, so the client application doesn't have to be polluted with retry logic.

Handling the local table metadata object would be difficult if the retry were pushed down to the library level. The commit would succeed, but the metadata object would point to an entirely different snapshot.