Closed giovannipizzi closed 1 year ago
Patch coverage: 98.86% and project coverage change: +0.03 :tada:
Comparison is base (f1809d4) 99.52% compared to head (ca1c1cb) 99.55%.
When reading a packed file as is, there is no need to restrict the `whence` parameter of the `seek` method to 0 or 1. The main goal of this PR is to enable `whence=2`, i.e. seeking relative to the end of a file, which some formats/libraries need.

Compressed files are trickier, as it is not possible to freely seek to the end (at least not cheaply). Instead, the entire file is decompressed back into a loose file, which is then opened for reading.
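
As a minimal sketch of the two cases above, using only the Python standard library (not this library's own API): an uncompressed stream can seek relative to its end directly, while zlib-compressed data is first decompressed in full to a regular file, which is then opened and seeked. The file name `loose_copy` and the use of a temporary directory are assumptions made for illustration.

```python
import io
import os
import tempfile
import zlib

payload = b"0123456789" * 100

# Uncompressed data: seeking relative to the end is cheap.
plain = io.BytesIO(payload)
plain.seek(-4, os.SEEK_END)  # os.SEEK_END == 2, i.e. whence=2
tail_plain = plain.read()

# Compressed data: decompress everything to a regular ("loose") file first,
# then seek inside that file. "loose_copy" is a hypothetical file name.
compressed = zlib.compress(payload)
with tempfile.TemporaryDirectory() as tmpdir:
    loose_path = os.path.join(tmpdir, "loose_copy")
    with open(loose_path, "wb") as fhandle:
        fhandle.write(zlib.decompress(compressed))
    with open(loose_path, "rb") as fhandle:
        fhandle.seek(-4, os.SEEK_END)
        tail_loose = fhandle.read()
```

Both `tail_plain` and `tail_loose` hold the last four bytes of the payload; the difference is only in the cost of getting there.
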
If such a file already exists, it is reused, so we don't decompress twice. Such "cache" files are deleted during routine maintenance operations (e.g. `clean_storage`).

In the current PR, upon certain conditions (now well defined, i.e. when seeking with the following conditions):
To achieve this goal robustly, we define a `LazyLooseStream` class that specifies which loose file to open while delaying the actual opening to a later point. This makes it possible to write code that always closes any open file.
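
The idea of delaying the opening can be illustrated with a small context manager. This is only a sketch of the general pattern, not the actual `LazyLooseStream` implementation: the class name `LazyFile` and its attributes are hypothetical.

```python
import os
import tempfile


class LazyFile:
    """Hypothetical lazy opener: stores a path and opens it only on `with`."""

    def __init__(self, path):
        self._path = path
        self._fhandle = None  # nothing is opened at construction time

    def __enter__(self):
        self._fhandle = open(self._path, "rb")
        return self._fhandle

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close the file, even if the body of the `with` raised.
        if self._fhandle is not None:
            self._fhandle.close()
            self._fhandle = None
        return False  # do not swallow exceptions


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "loose")
    with open(path, "wb") as fhandle:
        fhandle.write(b"hello world")

    lazy = LazyFile(path)  # the file is not opened yet
    opened_before = lazy._fhandle is not None
    with lazy as stream:
        stream.seek(-5, os.SEEK_END)
        tail = stream.read()
    closed_after = stream.closed
```

Because the open call happens inside `__enter__`, the caller can decide which file to use early on and still rely on the `with` block to close it.
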
I also added code to ensure that there are no race conditions if a `clean_storage` is running at the same time.

Furthermore, I cleaned up the code a bit and added various tests to increase coverage, since it had dropped over time. It didn't go back to 100%, but we are close (for the core library files).
Furthermore, I used the occasion to add a new `validate` CLI command that also uses tqdm (if installed) to show progress.

This PR fixes #136. It also replaces and thus closes #140, and closes #141.
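
For the optional tqdm progress display mentioned for the `validate` command, one common pattern is to fall back to a plain iterable when tqdm is not installed. The following is a sketch under assumptions, not the command's actual code: the helper `with_progress` and the toy `validate` function (which checks SHA-256 digests of in-memory data) are hypothetical.

```python
import hashlib

try:
    from tqdm import tqdm  # optional dependency
except ImportError:
    tqdm = None


def with_progress(iterable, total=None, desc=None):
    """Wrap the iterable in a tqdm progress bar only if tqdm is available."""
    if tqdm is None:
        return iterable
    return tqdm(iterable, total=total, desc=desc)


def validate(items):
    """Hypothetical check: `items` maps an expected SHA-256 hex digest to bytes.

    Returns the list of keys whose content does not match the digest.
    """
    invalid = []
    for expected, content in with_progress(
        items.items(), total=len(items), desc="Validating"
    ):
        if hashlib.sha256(content).hexdigest() != expected:
            invalid.append(expected)
    return invalid


good = b"abc"
items = {hashlib.sha256(good).hexdigest(): good, "deadbeef": b"xyz"}
invalid_keys = validate(items)
```

The validation result is the same with or without tqdm; only the progress output differs.
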