Eventual-Inc / Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0

[FEAT] Lance writes for swordfish #3299

Closed colin-ho closed 1 week ago

colin-ho commented 1 week ago

This PR implements Lance writes for swordfish.

The scaffolding for writes was merged in https://github.com/Eventual-Inc/Daft/pull/2992, so this PR only adds the Lance write functionality.

codspeed-hq[bot] commented 1 week ago

CodSpeed Performance Report

Merging #3299 will degrade performance by 29.77%

Comparing colin/swordfish-lance (a208770) with main (711e862)

Summary

❌ 2 regressions ✅ 15 untouched benchmarks

:warning: Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | colin/swordfish-lance | Change |
|---|---|---|---|
| test_count[1 Small File] | 3.6 ms | 4 ms | -10.58% |
| test_show[100 Small Files] | 22.3 ms | 31.7 ms | -29.77% |

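
The report does not state how the Change column is derived. The figures are roughly consistent with measuring the difference relative to the branch timing rather than the `main` timing; that formula is an assumption inferred from the numbers, not something the report confirms. A minimal sketch, using the rounded timings from the table (so the results differ slightly from the reported -10.58% and -29.77%, which were presumably computed from unrounded measurements):

```python
def relative_change(base_ms: float, head_ms: float) -> float:
    """Percent change relative to the head (branch) timing.

    Assumed formula, inferred from the table: negative means the branch is slower.
    """
    return (base_ms - head_ms) / head_ms * 100.0


# (main, colin/swordfish-lance) timings in milliseconds, as shown in the table.
benchmarks = {
    "test_count[1 Small File]": (3.6, 4.0),
    "test_show[100 Small Files]": (22.3, 31.7),
}

changes = {
    name: relative_change(base, head) for name, (base, head) in benchmarks.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Measuring against the `main` timing instead would give about 11% and 42% for the same two rows, which does not match the report.
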
codecov[bot] commented 1 week ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 77.52%. Comparing base (c4e1ab2) to head (a208770). Report is 10 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3299      +/-   ##
==========================================
+ Coverage   77.50%   77.52%   +0.02%
==========================================
  Files         666      667       +1
  Lines       81333    81424      +91
==========================================
+ Hits        63036    63128      +92
+ Misses      18297    18296       -1
```

| [Files with missing lines](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3299) | Coverage Δ | |
|---|---|---|
| src/daft-local-execution/src/pipeline.rs | `95.45% <100.00%> (+0.15%)` | :arrow_up: |
| src/daft-local-execution/src/sinks/write.rs | `100.00% <100.00%> (ø)` | |
| src/daft-local-plan/src/plan.rs | `96.57% <100.00%> (+0.18%)` | :arrow_up: |
| src/daft-local-plan/src/translate.rs | `93.90% <100.00%> (+0.31%)` | :arrow_up: |
| src/daft-writers/src/catalog.rs | `92.72% <ø> (ø)` | |
| src/daft-writers/src/lance.rs | `100.00% <100.00%> (ø)` | |
| src/daft-writers/src/lib.rs | `96.07% <ø> (ø)` | |
| src/daft-writers/src/physical.rs | `85.71% <100.00%> (ø)` | |
| src/daft-writers/src/pyarrow.rs | `99.38% <ø> (ø)` | |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/Eventual-Inc/Daft/pull/3299/indirect-changes)
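
The coverage percentages in the report can be reproduced from the hit and line counts. The sketch below assumes coverage is hits divided by lines, truncated (not rounded) to two decimals; the truncation is an inference from the numbers, since rounding would give 77.53% for the head commit rather than the reported 77.52%.

```python
import math


def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimals (assumed behavior)."""
    return math.floor(hits / lines * 10_000) / 100


# Counts taken from the coverage diff in the Codecov report.
base = {"hits": 63036, "misses": 18297, "lines": 81333}
head = {"hits": 63128, "misses": 18296, "lines": 81424}

# Sanity check: every tracked line is either a hit or a miss.
for counts in (base, head):
    assert counts["hits"] + counts["misses"] == counts["lines"]

base_pct = coverage_pct(base["hits"], base["lines"])
head_pct = coverage_pct(head["hits"], head["lines"])
delta = round(head_pct - base_pct, 2)

print(f"base {base_pct:.2f}%  head {head_pct:.2f}%  delta {delta:+.2f}%")
```

The 91 added lines came with 92 additional hits, which is why misses dropped by one and overall coverage rose slightly.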