vsoch / oci-python

Python implementation of Open Containers Initiative (OCI) specifications
https://vsoch.github.io/oci-python/
Mozilla Public License 2.0

usecase: building patches on top of an OCI container for ML #15

d4l3k opened this issue 2 years ago

d4l3k commented 2 years ago

I'm currently working on https://github.com/pytorch/torchx which is a project trying to make it easier to train and deploy ML models.

Quite a few of the cloud services / cluster tools for running ML jobs use OCI/Docker containers so I've been looking into how to make dealing with these easier.

Container based services:

TorchX currently supports patches on top of existing images to make it fast to iterate and then launch a training job. These patches are just overlaying files from the local directory on top of a base image. Our current patching implementation relies on having a local docker daemon to build a patch layer and push it: https://github.com/pytorch/torchx/blob/main/torchx/schedulers/docker_scheduler.py#L437-L493

Ideally we could build a patch layer and push it in pure Python without requiring any local docker instances since that's an extra burden on ML researchers/users. Building a patch should be fairly straightforward since it's just appending to a layer and pushing will require some ability to talk to the registry to download/upload containers.

It seems like OCI containers are a logical choice to use for packaging ML training jobs/apps but the current Python tooling is fairly lacking as far as I can see.

@vsoch curious what your thoughts are and if that's something you'd be interested in having merged into this repo

vsoch commented 2 years ago

@d4l3k oci-python here isn't really an implementation of a container runtime or of distribution - it's a way to interact with the standards themselves (e.g., image spec, digests, etc.)

Ideally we could build a patch layer and push it in pure Python without requiring any local docker instances since that's an extra burden on ML researchers/users.

Okay, so building an image without docker - and in Python! The latter (push and pull from a registry) is very easy / doable - that's how I first implemented being able to pull from Docker Hub down to Singularity without needing docker. But actually building the image usually requires some dependencies - e.g., here is an example that uses runc / skopeo / umoci on the backend. One library that I know of implementing the spec to some degree, and in pure Python, is Charliecloud - perhaps that would be a first shot to explore? I also started to work on oras-python, but (on my own) I couldn't quite figure out the design and started to mimic the Go code, which was a mistake.
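
To give a sense of what "easy" means here: pulling is basically two HTTP calls against the distribution API. A rough sketch with plain requests (registry and repository names are placeholders, and most public registries will still want an anonymous bearer token before this works):

import requests

registry = "https://registry.example.com"  # placeholder registry
name = "library/python"                    # placeholder repository

# Ask for the image manifest; the Accept header selects the schema 2 manifest
resp = requests.get(
    f"{registry}/v2/{name}/manifests/latest",
    headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
)
resp.raise_for_status()
manifest = resp.json()

# Every layer is just a content-addressed blob
for layer in manifest["layers"]:
    blob = requests.get(f"{registry}/v2/{name}/blobs/{layer['digest']}", stream=True)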

Building a patch should be fairly straightforward since it's just appending to a layer and pushing will require some ability to talk to the registry to download/upload containers.

I do agree that this particular need is fairly straightforward!

It seems like OCI containers are a logical choice to use for packaging ML training jobs/apps but the current Python tooling is fairly lacking as far as I can see.

And strangely this does appear to be the case, although I haven't searched exhaustively. So how about this for a proposal: give Charliecloud a try to see if it can be imported and used for the different functionality (and do some more searching for others - I'm surprised I couldn't find any in my quick search just now), and if that doesn't work, we can put together a little client. It probably doesn't even need to use oci-python here, because the OCI libs are really intended for Go, where you need to define the data structures and so on in advance - but if we are redundantly validating hashes and whatnot, it wouldn't hurt! I think it would be a good opportunity for me to refactor a bit here. So TLDR: yes, I'm definitely interested, but make sure that what you need isn't already out there, and probably we can create a new repository for this library.

d4l3k commented 2 years ago

Thanks for the quick reply! Good to know that push and pull is easy :)

When I say build I really mean tarball some files and add it as a layer to an existing manifest. I'm not looking to support anything as complex as Dockerfile.
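
Roughly, the "build" I have in mind is nothing more than this (just a sketch; pushing the blob and updated manifest is a separate step):

import hashlib
import tarfile

# Tar up the local files that should be overlaid onto the image
with tarfile.open("patch.tar.gz", "w:gz") as tf:
    tf.add("my_project", arcname="app/my_project")  # paths are illustrative

# The compressed tarball's sha256 becomes the layer digest in the manifest
with open("patch.tar.gz", "rb") as f:
    data = f.read()

layer = {
    "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
    "size": len(data),
    "digest": "sha256:" + hashlib.sha256(data).hexdigest(),
}
# then: manifest["layers"].append(layer), push the blob, push the updated manifest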

The current ML packaging solutions tend to either rely on full Docker (which is a bit of a steep onboarding for ML researchers) or a more bespoke solution such as https://cloud.google.com/ai-platform/training/docs/packaging-trainer which is built around python packages and is also fairly clunky.

For a lot of ML jobs, all you really need to do is take a pre-existing container (such as https://hub.docker.com/r/pytorch/pytorch), slap your model code on top of it, and launch it to a cluster. Just supporting tarballs as a new layer is the minimum needed for that use case. For more advanced stuff, a user would likely use a more full-fledged tool like docker to build a new base image.

Thanks for the pointer to Charliecloud, I hadn't seen that before.

vsoch commented 2 years ago

Sure thing! Let me know if you want to work on something, definitely sounds fun :)

d4l3k commented 2 years ago

Playing around with it right now, definitely will share if I get something working!

d4l3k commented 2 years ago

got this working with a mishmash of interfaces. It's not pretty but it works.

It'll need some cleanup and might see what makes sense to merge back into this repo.

from opencontainers.distribution import reggie

import requests
import io
import tarfile
import hashlib
import os
import os.path
import json
import gzip

dst_name = "torchx"
dst_ref = "tristanr_patched"

src_endpoint = "https://ghcr.io"
dst_endpoint = "https://<id>.dkr.ecr.us-west-2.amazonaws.com"

src = reggie.NewClient(
    src_endpoint,
    reggie.WithDefaultName("pytorch/torchx"),
)

with open("ecr.passwd", "rt") as f:
    password = f.read()

dst_auth = ("AWS", password)
dst = reggie.NewClient(
    dst_endpoint,
    reggie.WithUsernamePassword("AWS", password),
)

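# Fetch the base image's manifest from the source registry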
req = src.NewRequest(
    "GET",
    "/v2/<name>/manifests/<reference>",
    reggie.WithReference("0.1.2dev0"),
)
resp = src.Do(req)
manifest = resp.json()
print(manifest)
layers = manifest["layers"]
config_digest = manifest["config"]["digest"]

def get_blob_raw(digest):
    req = src.NewRequest("GET", "/v2/<name>/blobs/<digest>", reggie.WithDigest(digest))
    req.stream = True
    return src.Do(req)

def get_blob(digest):
    return get_blob_raw(digest).json()

config = get_blob(config_digest)
print(config)

wd = config["container_config"]["WorkingDir"]

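# Build the patch layer: a tar.gz with the files to overlay (here a single test.txt in the image's working directory)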
PATCH_FILE = "patch.tar.gz"
with tarfile.open(PATCH_FILE, mode="w:gz") as tf:
    content = b"blah blah"
    info = tarfile.TarInfo(os.path.join(wd, "test.txt"))
    info.size = len(content)
    tf.addfile(info, io.BytesIO(content))

def digest_str(s):
    m = hashlib.sha256()
    m.update(s)
    return "sha256:" + m.hexdigest()

def compute_digest(reader):
    # Stream the contents through sha256 so large layers don't need to fit in memory
    m = hashlib.sha256()
    size = 0
    while True:
        data = reader.read(64000)
        if not data:
            break
        m.update(data)
        size += len(data)
    return "sha256:" + m.hexdigest(), size

with open(PATCH_FILE, "rb") as f:
    patch_digest, patch_size = compute_digest(f)

with gzip.open(PATCH_FILE, "rb") as f:
    diff_digest, _ = compute_digest(f)

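# Reference the patch as a new layer in the manifest (docker layer media type, matching the base image)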
manifest["layers"].append(
    {
        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
        "size": patch_size,
        "digest": patch_digest,
    }
)

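# Destination registry helpers: existence check, monolithic blob upload, and manifest put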
def blob_exists(digest):
    resp = requests.head(
        dst_endpoint + f"/v2/{dst_name}/blobs/{digest}",
        auth=dst_auth,
    )
    return resp.status_code == requests.codes.ok

def upload(digest, blob):
    if hasattr(blob, "__len__"):
        size = len(blob)
    else:
        size = os.fstat(blob.fileno()).st_size
    print(f"uploading {digest}, len {size}")
    resp = requests.post(
        dst_endpoint + f"/v2/{dst_name}/blobs/uploads/?digest={digest}",
        data=blob,
        headers={
            "Content-Length": str(size),
        },
        auth=dst_auth,
    )
    resp.raise_for_status()

def upload_manifest(manifest):
    resp = requests.put(
        dst_endpoint + f"/v2/{dst_name}/manifests/{dst_ref}",
        headers={
            "Content-Type": manifest["mediaType"],
        },
        data=json.dumps(manifest),
        auth=dst_auth,
    )
    if resp.status_code != requests.codes.ok:
        print(resp.content)
    resp.raise_for_status()
    print(resp, resp.headers)

with open(PATCH_FILE, "rb") as f:
    upload(patch_digest, f)

class ResponseReader:
    def __init__(self, resp):
        self.resp = resp
        self.mode = "rb"

    def read(self, n):
        return self.resp.raw.read(n)

    def __len__(self):
        return int(self.resp.headers["Content-Length"])

to_upload = [layer["digest"] for layer in manifest["layers"]]

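# Record the uncompressed (diff) digest in the image config's rootfs.diff_ids, then upload the rewritten config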
config["rootfs"]["diff_ids"].append(diff_digest)
config_json = json.dumps(config)
config_digest = digest_str(config_json.encode("utf-8"))
upload(config_digest, config_json)
manifest["config"]["digest"] = config_digest
manifest["config"]["size"] = len(config_json)  # keep the descriptor size in sync with the new config

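# Cross-registry copy: stream any layer blob missing from the destination directly from the source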
for digest in to_upload:
    if blob_exists(digest):
        print(f"blob exists {digest}")
        continue
    resp = get_blob_raw(digest)
    reader = ResponseReader(resp)
    upload(digest, reader)

upload_manifest(manifest)

And the result, pulled from the destination registry:

docker run --pull always -i --rm <id>.dkr.ecr.us-west-2.amazonaws.com/torchx:tristanr_patched ls
tristanr_patched: Pulling from torchx
Digest: sha256:2b4ed5f899284e380ad5e5a421e0c8e8925e01ac4d0361d5d17423d3c78480f7
Status: Image is up to date for 495572122715.dkr.ecr.us-west-2.amazonaws.com/torchx:tristanr_patched
...
test.txt

vsoch commented 2 years ago

Thanks! I should be able to make some time this weekend.

vsoch commented 2 years ago

hey @d4l3k ! So I've started us a PR where we can hopefully address some of the challenges you faced. I'm not able to test the auth issues myself, so I'll need your insight / contribution on the PR to fix any bugs that you might have found.

https://github.com/vsoch/oci-python/pull/17

For the example that you have above, I think there are two approaches we can take. Either we provide more examples in the docs (which I started to do, since some of the interactions above are just fairly straightforward uses of the client), or - maybe in addition - we provide some kind of helper functions for these standard interactions. So basically I could imagine some combination of:

Let me know your thoughts! Feel free to grab the branch and work on it, or have more discussion here.

d4l3k commented 2 years ago

While docs/examples are nice, there are two main improvements that would be nice to have from the library:

1. Fix Basic Auth

Honestly, to simplify, you could just rip out the reggie retry logic and use requests' built-in username/password support. Not sure if there's anything special in there; the docker v2 spec doesn't say much about auth.

2. Better Credential Handling

i.e. load usernames/passwords for the remote registries from the environment.

I'm currently using docker.from_env() but would love to replace it with something like reggie.NewClient(..., reggie.WithEnvAuth()) that knows how to read from ~/.docker/config.json (rough sketch at the end of this comment).

If they have a credential store configured that's not possible though

https://stackoverflow.com/questions/36022892/how-to-know-if-docker-is-already-logged-in-to-a-docker-registry-server
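
For the simple (non credential store) case, what I have in mind is roughly this (function name made up):

import base64
import json
import os

def load_docker_auth(registry):
    # Read plain "auths" entries from ~/.docker/config.json; registries behind a
    # credential helper/store would still require shelling out to docker-credential-*
    path = os.path.expanduser("~/.docker/config.json")
    with open(path) as f:
        config = json.load(f)
    auth = config.get("auths", {}).get(registry, {}).get("auth")
    if not auth:
        return None
    username, _, password = base64.b64decode(auth).decode().partition(":")
    return username, password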

vsoch commented 2 years ago

I would be down for both of those (and I was hoping you'd be interested to contribute via a PR?). I think we could probably keep the style of the option functions (e.g., WithX) but use requests on the backend for a simpler approach. And WithEnvAuth is a great idea - if we add it here we can suggest it to the upstream!

d4l3k commented 2 years ago

Yup, if I end up going this route for TorchX I'd be happy to submit a PR to fix up the auth stuff. Not sure on the exact timing for that, working on a bunch of stuff in parallel :)

Appreciate all the help on this! This project was a big help in getting the proof-of-concept implementation done.

johntellsall commented 10 months ago

@d4l3k @vsoch hi all! I have an alternate solution -- add files directly to an upstream OCI image.

AppendLayer

This standalone utility appends a tarball to an existing image in a container registry – without having to pull down the image locally.

It supports any registry that implements the OCI Distribution Spec.

Why

The basic use case for this utility is when you have a base image that is already available in a container registry, and you simply need to add one or more files, then push the result back to the same registry.

In this case, you can do no better in terms of network transfer than this utility. It does the minimum amount of work in order to get the job done.

https://pypi.org/project/appendlayer/

Their use case is Apache Airflow. Mine is more general: I want developers to be able to redeploy a Docker/Kubernetes image with the same nonchalance as saving a file.

Hope this helps!

johntellsall commented 10 months ago

Another alternate solution: Docker (BuildKit) already supports using COPY to add files to an image without downloading the image! I'll take a look at putting together a demo of this.

adding capability that COPY layers can be rebased and reused via --cache-from even if cache for previous layers gets invalidated. All this works remotely with blobs in the registry. You can rebase an image on top of another image without the layers ever being downloaded or uploaded.

https://github.com/moby/buildkit/issues/2414

vsoch commented 10 months ago

I'm amenable to both of those! Thanks @johntellsall

vsoch commented 10 months ago

adding files to an image without downloading the image! I'll take a look at getting a demo of this.

I do think I need a demo - I'm not sure how we would be adding files to the image without downloading it, haha. As in, are we creating a single local layer that would then be pushed with an updated manifest (with the assumption that the previous layers already exist)? Is there a special flag for that?