tailhook / vagga

Vagga is a containerization tool without daemons
http://vagga.readthedocs.org
MIT License
1.86k stars 96 forks source link

DockerImage build step #572

Open anti-social opened 2 years ago

anti-social commented 2 years ago

Implemented:

anti-social commented 2 years ago

@tailhook Could you review the PR

anti-social commented 2 years ago

Other than a bit tricky async code and the absense of docs look good.

I'm considering not doing а concurrent download of docker layers.

tailhook commented 2 years ago

Other than a bit tricky async code and the absense of docs look good.

I'm considering not doing а concurrent download of docker layers.

Why? FuturesUnordered are fun to use, I think you just have to play with it a little bit to make yourself comfortable. Technically you can also use FuturesUnordered like spawn (i.e. use push), but that code might be too imperative :)

anti-social commented 2 years ago

With FuturesUnordered coroutines are launched without an order. So there is a possible situation when the first layer will be downloaded last. As a result an unpacking won't be executed concurrently with layers downloading.

I've tested a concurrent downloading a little. Large images become ready for use faster when their layers are downloaded in a sequence. Especially when the first layer is huge.

anti-social commented 2 years ago

I would rather use mpmc channel for a concurrent layers downloading. But I think it's redundant as the concurrent downloading cannot significantly increase total downloading speed due to limited network bandwidth.

tailhook commented 2 years ago

With FuturesUnordered coroutines are launched without an order. So there is a possible situation when the first layer will be downloaded last. As a result an unpacking won't be executed concurrently with layers downloading.

I've tested a concurrent downloading a little. Large images become ready for use faster when their layers are downloaded in a sequence. Especially when the first layer is huge.

It's the same issue with spawned futures :) Just add a way to sync this. Something like this:

let channels = (0..layers.len()+1).map(|_| oneshot::channel()).collect::<Vec<_>>();
let tx = channels.map(|(tx, _)| tx);
let rx = channels.map(|(tx, _)| rx);
tx.remove(0).send(|()| ()); // Activate first one
layers.zip(tx).zip(rx).map(|((l, tx), rx)| async {
  download().await;
  rx.recv().await;
  unpack().await;
  tx.send(()).await;
}).collect::<FuturesUnordered>().await;

(and maybe also add a semaphore to limit number of simultaneous downloads)

But I think it's redundant as the concurrent downloading cannot significantly increase total downloading speed due to limited network bandwidth.

Downloading in parallel to unpacking, and also setup cost on small files could make it not as clean. But yes, we can try simpler implementation first. Also indicatif is quite useful on big files.

anti-social commented 2 years ago

It's the same issue with spawned futures

Yeah, I know the first implementation also could download image layers out of order.

Also indicatif is quite useful on big files.

Progress bar is on the todo list. But I decided to do it later. Guess I can implement it in this PR.

anti-social commented 2 years ago

Updated documentation.

We need to know size of a layer to display a downloading progress. But with current dkregistry api it's not possible to get content length of the layer. I'll think about it later.

anti-social commented 2 years ago

@tailhook Could you review it and merge if all is OK

anti-social commented 2 years ago

The certificate for ubuntu.zerogw.com expired on 4/23/2022.