Open sdd opened 1 month ago
I've updated this to ditch the concurrency when processing `ManifestEntry` items within a single `Manifest`, producing them asynchronously but sequentially instead. I've kept the limited concurrency when processing `ManifestFile`s within the scan's snapshot's `ManifestList`.
I've kept the approach of using an mpsc channel with a spawned task, with that task using `try_for_each_concurrent` to achieve the concurrency. This is because without the channel and spawned task, we'd need to use an async closure, which is an unstable Rust feature. With the spawned task we only need a regular closure that returns an async block, which works in stable Rust.
This is a bit of an experiment to see how things could look if we tried to:
I'd like to add some unit tests to confirm that this behaves as expected beyond the existing tests that we have for TableScan, and add an integration / performance test that can quantify any performance improvements (or regressions 😅 ) that we get from these changes.
Let me know what you all think.