Closed adriangb closed 4 months ago
ACTION NEEDED
delta-rs follows the Conventional Commits specification for release automation.
The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
@adriangb I'm not opposed to the idea, can you share more about how this can be used outside of the delta context, I'm curious
This isn't to use outside of the delta context necessarily, just hook in more low level and with less code churn.
I have a Rust service that does buffering of data before writing to Delta. I control all of the writes going to Delta so I want to avoid all of the overhead of checking that the schemas match, etc which happens if you go through the public APIs. I also want to take control of the write retry loop because I may have to switch to pessimistic concurrency control under high write load (take out a lock on a low latency system like Redis only at the stage of writing the commit) to avoid write contention.
Currently data is written out manually using ArrowWriter as hive partitioned parquet. So if I can hook into PartitionWriter it's relatively little code changes (actually saves me a good bit of code doing the partitioning) and I can then go implement the writing of the commits as a later step to convert to a delta table.
Sounds reasonable to me! I think this API is stable enough to make public in the next release, please run a cargo fmt
and then this should merge
done @rtyler ! Seems like it dismissed your review :(
@rtyler @adriangb - recently I have been thinking if we should adopt the visibility crate to expose a larger api behind a feature gate and less guarantees. We uses this - so gar successfully I believe- in kernel ...
There also is some writing stuff - e.g. muti-part put from object_store on my todo list, that may affect this.
@roeap not sure I understood fully, are you saying that this right now should not be made public or that there is an opportunity for a more thoughtful approach to a public API within the rust crates in the future?
@roeap best to wait for object store 0.10 before working on the multi part
@aersam also already put some work there
@adriangb - i think we can make this public now, but maybe behind a feature gate. The visibility crate enables this.
@ion-elgreco - yes, still some things to cross off first, and we'll work from where we are, when i get to it 🙂.
i think we can make this public now, but maybe behind a feature gate. The visibility crate enables this.
Interesting, TIL!
I would like to use PartitionWriter within my own crate. Would you mind making this public so I don't have to vendor it and all of the associated code?