Open nileshpatra opened 3 weeks ago
If I do not set ORAS_CACHE, pull into an output directory, and then cancel the context, oras now seems to download partial content and clean it up. So it seems oras already manages partial downloads, is that right?
I need to use it in a script, so I need to know whether I have to handle this myself.
/cc: @qweeah
@nileshpatra Thanks for the valuable feedback!
Firstly, the user experience of oras pull is like cp commands: if a copy process is aborted halfway, then copied files in the destination folder will not be cleaned.
Secondly, you can set up the ORAS_CACHE variable to use a folder as a temporary cache for storing the files in a content-addressable way. But by design, ORAS CLI is not responsible for cleaning that folder.
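Since the ORAS CLI leaves the cache folder alone, a script that sets ORAS_CACHE would have to clean it up itself. A minimal sketch in Python, assuming the whole cache can be discarded between runs; the helper name `clear_oras_cache` is hypothetical and not part of ORAS:

```python
import os
import shutil
from pathlib import Path


def clear_oras_cache(environ=os.environ):
    """Delete the directory named by ORAS_CACHE; return True if it was removed.

    Hypothetical helper: it makes no assumption about the cache's internal
    layout and simply removes the whole directory.
    """
    cache = environ.get("ORAS_CACHE", "").strip()
    if not cache:
        return False  # no cache configured, nothing to clean
    cache_path = Path(cache)
    if not cache_path.is_dir():
        return False  # variable is set but the directory does not exist
    shutil.rmtree(cache_path)
    return True
```

Removing the cache only costs time on the next pull, because cached content has to be downloaded again.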
> it'd be good for oras to cleanup things in an intermediate state in order to start downloading again
IMHO, to speed up a re-pull, ORAS doesn't need to clean things up. The right thing to do is to check whether a to-be-pulled file already exists and skip it if an identical copy is already there.
On Mon, Jun 10, 2024 at 11:03:19PM -0700, Billy Zha wrote:
> @nileshpatra Thanks for the valuable feedback!
Thanks for your response!
> Firstly, the user experience of oras pull is like cp commands: if a copy process is aborted halfway, then copied files in the destination folder will not be cleaned.
If I understand correctly, oras pull will try to download the artifacts and manifests and, if aborted, the downloaded stuff will not be cleaned, correct?
> Secondly, you can set up the ORAS_CACHE variable to use a folder as a temporary cache for storing the files in a content-addressable way. But by design, ORAS CLI is not responsible for cleaning that folder.
Right. Does it "copy" things from any place to another place even if there's no cache setup?
On Mon, Jun 10, 2024 at 11:16:12PM -0700, Billy Zha wrote:
> > it'd be good for oras to cleanup things in an intermediate state in order to start downloading again
>
> IMHO, to speed up re-pull, ORAS doesn't need to cleanup things. The right thing to do is detect existent files and skip those.
Correct, but what if the downloaded files are partially downloaded, i.e. the entire artifact isn't present?
> Correct, but what if the downloaded files are partially downloaded, i.e. the entire artifact isn't present?
1) One artifact can contain more than one file. 2) If a file is partially downloaded, its checksum won't match the layer digest, so it won't be recognized as an existing file.
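The digest check described above can be sketched in Python. This is only an illustration, not how ORAS implements it: the function name `file_matches_digest` is hypothetical, and it assumes the layer blob is the file itself (not an archive of it) and that the digest has the usual `algorithm:hex` form, such as `sha256:...`.

```python
import hashlib
from pathlib import Path


def file_matches_digest(path, digest, chunk_size=1 << 20):
    """Return True if `path` exists and its content hashes to `digest`.

    `digest` is expected in `algorithm:hex` form, e.g. "sha256:ab12...".
    """
    algorithm, _, expected_hex = digest.partition(":")
    file_path = Path(path)
    if not file_path.is_file():
        return False  # missing file: it has to be pulled
    hasher = hashlib.new(algorithm)
    with file_path.open("rb") as f:
        # Read in chunks so large layers do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    # A truncated (partially downloaded) file produces a different hash.
    return hasher.hexdigest() == expected_hex.lower()
```

A script could call this for each expected file and re-pull only when it returns False; a partially downloaded file fails the comparison and is treated the same as a missing one.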
> If I understand correctly, oras pull will try to download the artifacts and manifests and if aborted, the downloaded stuff will not be cleaned, correct?
Yes.
> Right. Does it "copy" things from any place to another place even if there's no cache setup?
If there is no cache setup, there is no partial file created in the local file system.
As mentioned in https://github.com/oras-project/oras-go/issues/777, the performance of oras pull can be optimized if oras-go skips copying files that already exist.
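For the scripting use case raised in this thread: since files already written by an aborted oras pull are not cleaned up, a script that wants all-or-nothing behaviour could pull into a scratch directory and move it into place only on success. A rough sketch, assuming the destination does not exist yet (or is empty); `pull_all_or_nothing` and its `runner` parameter are hypothetical names used for illustration, and the only CLI call used is `oras pull <reference> -o <dir>`:

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def pull_all_or_nothing(reference, dest, runner=subprocess.run):
    """Pull `reference` into a scratch directory; move it to `dest` on success."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the destination so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".oras-staging-", dir=dest.parent))
    try:
        # check=True raises CalledProcessError on a non-zero exit code.
        runner(["oras", "pull", reference, "-o", str(staging)], check=True)
        staging.rename(dest)  # raises OSError if dest exists and is not empty
    except BaseException:
        # Covers failed pulls and Ctrl-C: discard whatever was partially written.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return dest
```

The trade-off is that nothing from a failed attempt is reused, so this favours a clean destination over a fast re-pull.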
What is the version of your ORAS CLI
1.2.0
What would you like to be added?
I could not find anywhere in the docs whether oras pull cleans up the output directory or temporary files when oras pull is unsuccessful, for instance due to a poor network connection.
Why is this needed for ORAS?
If this feature is not already present, it'd be good for oras to clean up things left in an intermediate state so that downloading can start again, or at least to offer an option that enables this.
Are you willing to submit PRs to contribute to this feature?