oras-project / oras-py

ORAS Python SDK
https://oras-project.github.io/oras-py/
Apache License 2.0
40 stars 36 forks source link

Enable chunked uploads #150

Closed isinyaaa closed 2 months ago

isinyaaa commented 3 months ago

TL;DR: Rebases #137. Also fixes out-of-date _state_ parameter on session_url, which caused a 404 when resuming/completing uploads.

We wanted to use oras for larger uploads (say, ML model files) at containers/omlmd, but I wasn't able to make them work with standard uploads. I noticed #137, rebased it, addressed some of your comments (not sure how to address all of them, though). But I still couldn't make it work with https://hub.docker.com/_/registry. So I started debbuging to find out there's a _state_ parameter being passed around, and I assume it must be updated on the https://distribution.github.io/distribution/spec/api/#completed-upload PUT request to reflect the last state reported by the server. I tested this with 20GB-ish files.

I wonder if this could help with bringing back chunked file support by default, although I'm not really sure in which aspect that kind of support is "flaky" or (as I experienced) poorly documented.

vsoch commented 3 months ago

@isinyaaa please see my previous review comment - we need explicit tests for the chunked upload.

isinyaaa commented 3 months ago

@isinyaaa please see my previous review comment - we need explicit tests for the chunked upload.

@vsoch sorry I missed that. Updated now, wdyt?

vsoch commented 3 months ago

Nice! Let's run these tests now.

isinyaaa commented 3 months ago

Oops, I apologize for the dumb mistake, updated now @vsoch .

isinyaaa commented 3 months ago

@vsoch can you reapprove tests? I ran the linter locally to make sure it's working, sorry for the trouble

isinyaaa commented 3 months ago

@vsoch not sure what went wrong with those tests... maybe some issue with the generated file size? from the raw logs I can't spot any problems, neither locally.

vsoch commented 3 months ago

Here is what I see:


/bin/bash scripts/test.sh
ORAS_PORT: 5000
ORAS_HOST: localhost
ORAS_REGISTRY: localhost:5000
ORAS_AUTH: 
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/runner/work/oras-py/oras-py
configfile: pyproject.toml
collected 22 items
oras/tests/test_oci.py .
oras/tests/test_oras.py .sSuccessfully pushed localhost:5000/dinosaur/artifact:v1
Successfully pushed localhost:5000/dinosaur/artifact:v1
.Successfully pushed localhost:5000/dinosaur/artifact:v1
.Successfully pushed localhost:5000/dinosaur/artifact:v1
..Successfully pushed localhost:5000/dinosaur/directory:v1
.s
oras/tests/test_provider.py Successfully pushed localhost:5000/dinosaur/artifact:v1
Successfully pushed localhost:5000/dinosaur/artifact:v1
0+0 records in
0+0 records out
0 bytes copied, 3.8763e-05 s, 0.0 kB/s

I would try to reproduce locally.

isinyaaa commented 3 months ago

@vsoch as expected, while testing on my fork I found the problem lies when working with those very large files on GHA (workflow run on my modified main). It worked when I reduced the test file size to be a couple times the default chunk size, wdyt?

vsoch commented 3 months ago

Should the chunk size perhaps be smaller then?

isinyaaa commented 3 months ago

I don't really think that's a problem, up to you. The issue was in creating a 15GB test file in github actions. I think the worker didn't have this much space to spare or something. I reduced the test file to be 4x the default chunk size (4x 16MB), that's all.

vsoch commented 3 months ago

Ah ok. Please keep the tests in GitHub the same as what you are doing (and what is working) locally and let's try making more space on the builder - these first three lines before the tests to cleanup and add space should be sufficient:

https://github.com/converged-computing/fluxnetes/blob/0d577aa3155e68aff457f390f8c926e9f57a13d3/.github/workflows/e2e-test.yaml#L129-L133

But you can add more as needed.

isinyaaa commented 3 months ago

@vsoch updated! thanks for the pointer, I had no idea the default images could be so huge! Though the first three lines didn't suffice as we actually need 15GB for the file + 15GB for the upload.

isinyaaa commented 2 months ago

Hey, @vsoch, can we merge this yet?

vsoch commented 2 months ago

Yes - we are close! Can you please bump the version in oras/version.py and make a note about the change in CHANGELOG.md? That should be the final bit we need.

isinyaaa commented 2 months ago

updated, wdyt?