polarsignals / frostdb

❄️ Coolest database around 🧊 Embeddable column database written in Go.
Apache License 2.0
1.27k stars 65 forks

Current snapshot loses some ManifestEntry records when Iceberg uploads run concurrently #891

Closed jicanghaixb closed 2 months ago

jicanghaixb commented 2 months ago

The current Iceberg integration runs the Upload function concurrently (when FrostDB ingests data quickly). There is a race condition that leads to partial loss of the ManifestEntry data in the current snapshot. Maybe Upload needs a lock.

thorfour commented 2 months ago

Any chance you're able to recreate this scenario with a unit test?

thorfour commented 2 months ago

Never mind, I assume you're likely using the HDFS catalog, which isn't meant for concurrent writers. I think we can simply add a lock to the snapshot writer of the HDFS catalog.

jicanghaixb commented 2 months ago

Thanks. I wrapped Iceberg and added an upload lock to avoid the issue.