Closed nicholasjng closed 5 months ago
Attention: 1 lines
in your changes are missing coverage. Please review.
Comparison is base (
cc7d289
) 94.08% compared to head (f6b0cca
) 94.24%.
Files | Patch % | Lines |
---|---|---|
src/lakefs_spec/transaction.py | 98.63% | 0 Missing and 1 partial :warning: |
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Stateful transaction support requested in https://github.com/fsspec/filesystem_spec/issues/1517.
Another idea is to give default arguments to the file system constructor, but these are stickier / more opaque than initializer arguments to the transaction itself. Passing arguments to the transaction has to be solved on the fsspec level, however, if there is interest at all by the maintainers.
In case there is no upstream solution, there still is the third option, which I explored in my prototype: turning LakeFSTransaction
into a callable object, which allows passing (optional) configuration when using it as a context manager, but still being compatible with the way that fsspec instantiates the transaction class.
This is a breaking change, as it requires users to give arguments to the return object of
LakeFSFileSystem.transaction
.Changes the
LakeFSTransaction
class to become a repository+branch-scoped ephemeral branch.File uploads, removals, and commits happen on that ephemeral branch. After transaction exit, the created branch is optionally merged back into the base branch.
The base branch (from which we branch off in the transaction) must exist before start of the transaction.
Some transaction APIs (merge, revert, tag) can also work on branches other than the transaction's ephemeral branch - this behavior might not be well-defined, and is up for debate on removal.
Similarly, on puts and deletes, one currently has to specify the transaction branch directly via the
LakeFSTransaction.branch
property. This can be made nicer by allowing resource paths relative to the transaction's repo and branch in special functions like e.g.LakeFSTransaction.put()
(deferring tofs.put()
),LakeFSTransaction.rm()
(deferring tofs.rm()
), and so on.This pull request turns the
LakeFSTransaction
into a callable, which needs to be called with a repository name at minimum to ensure transaction execution.