ipfs-inactive / package-managers

[ARCHIVED] 📦 IPFS Package Managers Task Force
MIT License

Docker on IPFS #13

Open andrew opened 5 years ago

andrew commented 5 years ago

Documenting some of the different efforts to share and load docker images via IPFS

starship

Protocol project written in bash for managing containers (not just docker) with IPFS. It doesn't look like it was ever published, but @jbenet gave a talk about it in 2015: https://www.youtube.com/watch?v=vaIWRyotz4g

IPDR - IPFS-backed Docker Registry

New project (only two weeks old), written in Go, that provides a CLI and an HTTP proxy server. The proxy conforms to the Docker registry HTTP API v2 spec and talks to an IPFS-backed Docker registry server.

Sidenote: Interesting slide in a docker presentation on the v2 API spec:

[Image: screenshot 2019-03-04 at 15 23 52]

A quick skim through the spec shows lots of similarities to IPLD: https://docs.docker.com/registry/spec/api/#content-digests
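To make the similarity concrete, here is a minimal Python sketch using only the standard library. The function names are made up for illustration and are not taken from any project in this thread. It computes a registry-style `sha256:<hex>` content digest and then re-wraps the same hash bytes as a base32 CIDv1 with the raw codec, which shows that both identifiers carry the same hash. Note that this CID will generally not match what `ipfs add` prints, since IPFS may chunk a file into a DAG before hashing.

```python
import base64
import hashlib


def registry_digest(blob: bytes) -> str:
    """Return a Docker Registry v2 style content digest: 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def digest_to_cid(digest: str) -> str:
    """Re-wrap a 'sha256:<hex>' digest as a base32 CIDv1 (raw codec).

    Illustrative only: it shows that both identifiers carry the same
    hash bytes, nothing more.
    """
    algorithm, _, hex_value = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError("only sha256 digests are handled in this sketch")
    raw_hash = bytes.fromhex(hex_value)
    if len(raw_hash) != 32:
        raise ValueError("a sha256 digest must be 32 bytes long")
    # Multihash: 0x12 = sha2-256, 0x20 = digest length (32 bytes).
    multihash = bytes([0x12, 0x20]) + raw_hash
    # CIDv1: version 0x01, then codec 0x55 (raw), then the multihash.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    # 'b' is the multibase prefix for lowercase base32.
    return "b" + encoded
```

For example, `digest_to_cid(registry_digest(layer_bytes))` returns a lowercase identifier for whatever bytes are passed in as `layer_bytes`.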

image2ipfs

Initially created 3 years ago, rewritten in Go just this past weekend!

Also provides an IPFS-backed HTTP gateway. The readme has some interesting notes:

Docker requires image names to be all lowercase which doesn't play nicely with base58-encoded binary.

There is WIP to move to base32 I believe: https://github.com/ipfs/ipfs/issues/337

Not sure. It would be great if an IPFS gateway could speak the Registry v2 protocol at /v2/* so you don't need to run a registry.

The tag is always "latest". The "real" reference is encoded in the name of the image, that is, the "dockerized hash". Even if you export myimage:my-tag, in the IPFS registry you always tag "latest" but an image like ciq9a3eafb835d79e/myimage:latest. This is very similar to how gx and gx-go work [https://github.com/whyrusleeping/gx].
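As a rough illustration of the naming convention described in that readme note, the Python sketch below parses a Registry v2 request path and splits the repository name into the "dockerized hash" and the image name. The route shapes `/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>` come from the Registry v2 spec; the `RegistryRequest` type, the function name, and the assumption that the hash is always the first name component are hypothetical and are not how image2ipfs is necessarily implemented.

```python
import re
from typing import NamedTuple


class RegistryRequest(NamedTuple):
    content_ref: str  # first name component, assumed to be the "dockerized hash"
    image: str        # remaining name components, e.g. "myimage"
    kind: str         # "manifests" or "blobs"
    reference: str    # tag or digest


# Lowercase-only names, matching Docker's restriction on image names.
_ROUTE = re.compile(
    r"^/v2/(?P<name>[a-z0-9._/-]+)/(?P<kind>manifests|blobs)/(?P<ref>[^/]+)$"
)


def parse_registry_path(path: str) -> RegistryRequest:
    """Split a Registry v2 path into hash, image name, kind and reference."""
    match = _ROUTE.match(path)
    if match is None:
        raise ValueError("not a manifests/blobs route: " + path)
    content_ref, separator, image = match.group("name").partition("/")
    if not separator or not image:
        raise ValueError("expected '<hash>/<image>' as the repository name")
    return RegistryRequest(
        content_ref, image, match.group("kind"), match.group("ref")
    )
```

A gateway that wanted to answer `/v2/*` requests could use the `content_ref` field to decide which IPFS object to serve, though that mapping is left out here because the thread does not describe it.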

Loading docker images directly from IPFS without a registry

Webcache of a blogpost that's no longer around, documenting a nice little hack to load and save images directly from docker, which can then be added to IPFS, pulled down again and then loaded directly into docker.

docker save 3f8a4339aadd > nginx.tar

ipfs add nginx.tar

ipfs get Qmbi6Y2aFG3SfjpzQgbu65on7EMFwusGqLYSPCGrcVRJ8A

docker load --input Qmbi6Y2aFG3SfjpzQgbu65on7EMFwusGqLYSPCGrcVRJ8A

docker tag sha256:3f8a4339aadda5897b744682f5f774dc69991a81af8d715d37a616bb4c99edf5 nginxipfs
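The same round trip can be scripted. The Python sketch below only builds the argument lists for the `docker` and `ipfs` commands shown above and provides a small runner. The function names are hypothetical, and it assumes both CLIs are on the PATH. It uses the long-form flags (`--output`, `--input`, `--quieter`) rather than shell redirection so that each step is a plain argv list.

```python
import subprocess
from typing import List


def export_commands(image: str, tar_path: str) -> List[List[str]]:
    """Commands to save a Docker image to a tarball and add it to IPFS."""
    return [
        ["docker", "save", "--output", tar_path, image],
        # --quieter prints only the final hash of the added file.
        ["ipfs", "add", "--quieter", tar_path],
    ]


def import_commands(cid: str, tar_path: str) -> List[List[str]]:
    """Commands to fetch a tarball from IPFS and load it into Docker."""
    return [
        ["ipfs", "get", "--output", tar_path, cid],
        ["docker", "load", "--input", tar_path],
    ]


def run_all(commands: List[List[str]]) -> List[str]:
    """Run each command in order, stopping at the first failure.

    Returns the stripped stdout of every command that ran.
    """
    outputs = []
    for argv in commands:
        result = subprocess.run(
            argv, check=True, capture_output=True, text=True
        )
        outputs.append(result.stdout.strip())
    return outputs
```

Calling `run_all(export_commands("3f8a4339aadd", "nginx.tar"))` would return the output of both steps, with the hash printed by `ipfs add` as the last element.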

lidel commented 5 years ago

Docker requires image names to be all lowercase which doesn't play nicely with base58-encoded binary.

CIDv0 Base58 → CIDv1 Base32:

ipfs cid base32 QmbWqxBEKC3P8tqsKc98xmWNzrzDtRLMiMPL8wBuTGsMnR
bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
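For readers who want to see what that conversion does, here is a small Python sketch of the same transformation using only the standard library. The function names are illustrative, not part of any IPFS tooling. A CIDv0 is a base58btc-encoded sha2-256 multihash; the CIDv1 form prepends a version byte and the dag-pb codec, then encodes the result as lowercase base32 with a `b` multibase prefix.

```python
import base64

_B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_decode(text: str) -> bytes:
    """Decode a base58btc string (Bitcoin alphabet) into bytes."""
    number = 0
    for char in text:
        number = number * 58 + _B58_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def cid_v0_to_v1_base32(cid_v0: str) -> str:
    """Convert a CIDv0 ('Qm...') into a lowercase base32 CIDv1."""
    multihash = base58_decode(cid_v0)
    # CIDv0 is always sha2-256 (0x12) with a 32-byte (0x20) digest.
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("not a CIDv0 (expected a sha2-256 multihash)")
    # 0x01 = CID version 1, 0x70 = dag-pb codec (implied by CIDv0).
    cid_bytes = b"\x01\x70" + multihash
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded
```

The result contains only lowercase letters and digits, which is what makes it usable inside a Docker image name.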

mikeal commented 5 years ago

Another approach would be to add a docker volume plugin https://docs.docker.com/engine/extend/plugins_volume/

This might allow us to mount an IPFS URL as the volume for the docker image, bypassing the need for a registry and install altogether. The main advantage is that we could mount the file system without all the data, load the data “Just in Time” for resources critical to boot, and load the rest in the background while the person is using the container.
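The "Just in Time" idea can be sketched independently of Docker. The Python class below is a hypothetical illustration, not the Docker volume plugin API: it represents a file whose fixed-size blocks are fetched through a caller-supplied function only when a read touches them. In a real system the `fetch_block` callback would be backed by IPFS; here it is just an assumption of the sketch.

```python
from typing import Callable, Dict


class LazyBlockFile:
    """A file-like view whose blocks are fetched on first access."""

    def __init__(
        self, size: int, block_size: int, fetch_block: Callable[[int], bytes]
    ) -> None:
        self.size = size
        self.block_size = block_size
        self._fetch_block = fetch_block
        self._cache: Dict[int, bytes] = {}

    def read(self, offset: int, length: int) -> bytes:
        """Return up to `length` bytes at `offset`, fetching missing blocks."""
        end = min(offset + length, self.size)
        if offset < 0 or offset >= end:
            return b""
        first = offset // self.block_size
        last = (end - 1) // self.block_size
        parts = []
        for index in range(first, last + 1):
            if index not in self._cache:
                # Only blocks that a read actually touches are fetched.
                self._cache[index] = self._fetch_block(index)
            parts.append(self._cache[index])
        data = b"".join(parts)
        start = offset - first * self.block_size
        return data[start:start + (end - offset)]
```

A background task could walk the remaining block indexes and call `read` on each to warm the cache while the container is already running.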

lanzafame commented 5 years ago

@mikeal In this particular case, I actually think that building a Docker registry that uses IPFS would be more beneficial than a volume plugin, in the context of deployment of images to cloud environments, whereas a volume plugin would be useful if we wanted developers to more easily integrate what they are running in the containers with IPFS.

On a separate note, I think to take full advantage of IPFS as an image distribution mechanism, we would want a chunker that understands the Docker image format so that chunks are along image layer boundaries. (OCI Image Spec: https://github.com/opencontainers/image-spec/blob/master/spec.md)
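As a rough sketch of what "chunks along layer boundaries" could mean, the Python function below walks a `docker save` style archive and yields one chunk description per layer. It assumes the archive contains a `manifest.json` whose entries list their layer files under a `Layers` key; the function name is hypothetical and nothing here plugs into the actual IPFS chunker interface.

```python
import hashlib
import io
import json
import tarfile
from typing import Iterator, Tuple


def layer_chunks(image_tar: bytes) -> Iterator[Tuple[str, str, int]]:
    """Yield (layer path, sha256 hex digest, size) for each image layer.

    Assumes a `docker save` style tarball with a top-level manifest.json.
    """
    with tarfile.open(fileobj=io.BytesIO(image_tar)) as archive:
        manifest_file = archive.extractfile("manifest.json")
        if manifest_file is None:
            raise ValueError("manifest.json is not a regular file")
        manifest = json.load(manifest_file)
        for image in manifest:
            for layer_path in image["Layers"]:
                layer_file = archive.extractfile(layer_path)
                if layer_file is None:
                    raise ValueError("layer is not a regular file: " + layer_path)
                layer_bytes = layer_file.read()
                # One chunk per layer, so identical layers hash identically
                # no matter which image they were exported from.
                digest = hashlib.sha256(layer_bytes).hexdigest()
                yield layer_path, digest, len(layer_bytes)
```

Because each layer is hashed on its own, two images that share a base layer would produce the same digest for it, which is the deduplication benefit a layer-aware chunker is after.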

I mention OCI because, if feasible, I think we should make this work applicable not just to Docker but to all the other container/image implementations, e.g. rkt, which implements the appc image spec, and skopeo, which uses containers/image.

Also, I just discovered this: https://github.com/opencontainers/go-digest, which is interesting.

Also as a reference, the OCI scope table: https://www.opencontainers.org/about/oci-scope-table

andrew commented 5 years ago

Uber just announced Kraken, "an Open Source Peer-to-Peer Docker Registry", written in Go.

It doesn't look like it uses IPFS or libp2p but they have this line in their blog post that caught my eye:

Kraken supports pluggable storage options, and instead of managing data blobs, Kraken plugs into reliable blob storage options like S3, HDFS, or another registry. The storage interface is simple, and new options are easy to add.

mikeal commented 5 years ago

I just got an intro to a person on the Kraken team. Who would be interested in having a call with them?

andrew commented 5 years ago

Comment by @anorth on slack about Uber's Kraken:

Response from Yiran Wang @ Uber:

We actually didn't look at many options in the beginning, because we initially wanted to do the same thing as Dragonfly - have a central component that distribute every 4MB blocks. IPFS actually looks very impressive, if we knew about it then, we would just use it, at least to start with.

Later we realized the central scheduling approach is not gonna work, and switched to a BitTorrent-like design, but at this point we also know we need something different from BitTorrent - for transferring 100G+ files, we need a way to rebalance the network periodically and randomly to achieve a random k-regular graph, and I don't think BitTorrent can do that (please correct me if I am wrong here). I found this Jellyfish paper last week, which is similar to what we want to do (but on higher layer). And in my opinion, this is where distribution within datacenter and on the internet might be different - we strive for a perfect topology within datacenter to reduce max download time for concurrent downloads, but this might not be a priority for downloads from the internet.
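For reference, a random k-regular graph is one where every node has exactly k neighbours chosen at random. The Python sketch below generates one with the classic pairing (configuration) model plus rejection. It is only meant to illustrate the target topology and is not Kraken's algorithm, which the quote does not describe; the function name and parameters are hypothetical.

```python
import random
from typing import Optional, Set, Tuple


def random_regular_graph(
    n: int, k: int, seed: Optional[int] = None, max_tries: int = 1000
) -> Set[Tuple[int, int]]:
    """Return the edge set of a random simple k-regular graph on n nodes."""
    if not 0 <= k < n or (n * k) % 2 != 0:
        raise ValueError("need 0 <= k < n and n * k even")
    rng = random.Random(seed)
    for _ in range(max_tries):
        # Give every node k "stubs", shuffle them, and pair them up.
        stubs = [node for node in range(n) for _ in range(k)]
        rng.shuffle(stubs)
        edges: Set[Tuple[int, int]] = set()
        valid = True
        for a, b in zip(stubs[::2], stubs[1::2]):
            edge = (min(a, b), max(a, b))
            if a == b or edge in edges:
                # Self-loop or duplicate edge: discard this attempt.
                valid = False
                break
            edges.add(edge)
        if valid:
            return edges
    raise RuntimeError("no simple graph found within max_tries attempts")
```

Rebalancing a live network toward this shape is a harder problem than generating the graph from scratch, since peers would need to swap connections without disrupting transfers.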

moritonal commented 5 years ago

Hi all, I've been working the last few weeks on implementing an IPFS Storage plugin for the Docker Registry. You can find the key parts of my work here and I've written a blog article on this here. The benefits from shared layers through IPFS are insane if they were to hit critical mass, but the speed of bitswap discovery in IPFS seems pretty rough? I'd love to get more feedback on my work.

andrew commented 5 years ago

@moritonal thanks for sharing, have you seen https://github.com/hinshun/ipcs?

@dirkmc is starting to look into bitswap performance issues and @aschmahmann is working on making IPNS fast and reliable, both of which seem like they could help with some of the performance issues.

For content discovery speed, something like what is outlined in this blog post might help: https://medium.com/pinata/speeding-up-ipfs-pinning-through-swarm-connections-b509b1471986

hsanjuan commented 5 years ago

Oops, I never mentioned it in this thread, but I did implement an IPFS docker driver a few weeks ago too:

https://github.com/docker/distribution/pull/2906

dirkmc commented 5 years ago

@moritonal that's very interesting, thank you for putting this together 👍

Could you go into more detail about the performance issues you are seeing with bitswap? Did you do any profiling?