databricks-industry-solutions / pixels

Facilitates simple large-scale processing of HLS medical images, documents, and zip files. Previously at https://github.com/dmoore247/pixels
https://databricks-industry-solutions.github.io/pixels/

AWS Concurrent update due to concurrent processing of PUSH requests #25

Open dmoore247 opened 1 year ago

dmoore247 commented 1 year ago

We've seen this before, @dbbnicole. I think closely timed pushes end up interfering with each other.

Environment: AWS build environment

Fully reproducible code snippet: https://github.com/databricks-industry-solutions/pixels/actions/runs/6728292245/job/18287413930

Full error message:

ConcurrentAppendException: Files were added to the root of the table by a concurrent update. Please try the operation again.

Conflicting commit: {"timestamp":1698900146066,"userId":"5215232244814299","userName":"jingting.lu@databricks.com","operation":"CREATE OR REPLACE TABLE AS SELECT","operationParameters":{"isManaged":true,"description":null,"partitionBy":[],"properties":{"delta.autoOptimize.autoCompact":"true","delta.autoOptimize.optimizeWrite":"true","delta.targetFileSize":"16mb"}},"job":{"jobId":"956995447075586","jobName":"[RUNNER] 4e4e8920c50c856d9dcb0265ccaee7e09b5bf077 | 4a2cee290591229fe8272a7eafa3144a7690b146d13d4f56ed22d5e317d02d47","runId":"505310858445190","jobOwnerId":"5215232244814299","triggerType":"manual"},"notebook":{"notebookId":"2437486787348261"},"clusterId":"1102-043119-bqziscvh","readVersion":42,"isolationLevel":"WriteSerializable","isBlindAppend":false,"operationMetrics":{"numFiles":"32","numOutputRows":"10426","numOutputBytes":"1638980087"},"engineInfo":"Databricks-Runtime/11.3.x-cpu-ml-scala2.12","txnId":"be532363-1c96-4eba-b43f-56a77cd9801b"}

Refer to https://docs.databricks.com/delta/concurrency-control.html for more details.
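
Since the error message itself suggests trying the operation again, one possible mitigation is to wrap the conflicting write in a retry with backoff. The following is only a minimal sketch, not a fix adopted in this repo: `retry_on_conflict` and its parameters are hypothetical names, and it matches the exception by class name so that it runs without Delta Lake installed.

```python
import random
import time


def retry_on_conflict(operation, max_attempts=5, base_delay=1.0,
                      conflict_names=("ConcurrentAppendException",),
                      sleep=time.sleep):
    """Run operation(), retrying when it raises a named conflict exception.

    Hypothetical helper: the names and defaults here are assumptions,
    not part of pixels or Delta Lake.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Match by class name so the sketch does not import delta.
            is_conflict = type(exc).__name__ in conflict_names
            if not is_conflict or attempt == max_attempts:
                raise
            # Exponential backoff plus jitter, so two runners that
            # collided once do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

In a notebook this might wrap the statement that failed, for example `retry_on_conflict(lambda: spark.sql(ctas_statement))`, where `ctas_statement` is a placeholder for the actual `CREATE OR REPLACE TABLE AS SELECT`. Note that a retry only makes the later writer win; if two CI runs should not share a table at all, giving each run its own table or schema may be the better option.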