Open betatim opened 5 years ago
Hmmm, kubespawner already listens for events etc. Perhaps we can derive from KubeSpawner, measure how much time is spent in "pulling image" or similar, and then log it somewhere? That is my 10 second intuition idea, probably want to discard it soon :p
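A rough sketch of the timing part of that idea, in plain Python with no KubeSpawner or Kubernetes client dependency. It assumes we can get a pod's events as dicts with a `reason` and a `timestamp`; that dict shape and the `image_pull_seconds` helper are made up for illustration, not a KubeSpawner API. `Pulling` and `Pulled` are the event reasons the kubelet reports around an image pull; as far as I know an image that is already on the node gets only a `Pulled` event, which this sketch skips.

```python
from datetime import datetime


def image_pull_seconds(events):
    """Return the seconds between each 'Pulling' event and the next 'Pulled' event.

    `events` is a hypothetical list of dicts with 'reason' and 'timestamp' keys.
    """
    durations = []
    started = None
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event["reason"] == "Pulling":
            started = event["timestamp"]
        elif event["reason"] == "Pulled" and started is not None:
            durations.append((event["timestamp"] - started).total_seconds())
            started = None
    return durations


# Made-up events for a single pod launch.
events = [
    {"reason": "Scheduled", "timestamp": datetime(2019, 1, 1, 12, 0, 0)},
    {"reason": "Pulling", "timestamp": datetime(2019, 1, 1, 12, 0, 1)},
    {"reason": "Pulled", "timestamp": datetime(2019, 1, 1, 12, 1, 31)},
    {"reason": "Started", "timestamp": datetime(2019, 1, 1, 12, 1, 33)},
]
print(image_pull_seconds(events))  # [90.0]
```

Comparing these durations against the total launch time would tell us how much of a launch is actually spent pulling.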
We should check if the idea that only one image can be pulled at a time is correct.
My local docker CLI seems to be able to pull the same or different images in parallel. So if there is any locking at all, it is in Kubernetes.
kubelet has `--serialize-image-pulls`, which defaults to true. This makes me think we do serialize pulls.
Serializing probably makes sense though, because having pulled 1/5 of each of 5 images is still 0 images completed, hmmm....
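To put numbers on that intuition, here is a toy calculation. It assumes all images are the same size, that parallel pulls split the node's bandwidth evenly, and that one image pulled alone takes `seconds_per_image`; the function and the 60 second figure are invented for illustration, not measurements.

```python
def completion_times(n_images, seconds_per_image, serialized):
    """Return the time at which each image finishes pulling, under the toy model."""
    if serialized:
        # One pull at a time: image i finishes after (i + 1) full pulls.
        return [seconds_per_image * (i + 1) for i in range(n_images)]
    # All pulls share the bandwidth equally, so they all finish together at the end.
    return [seconds_per_image * n_images] * n_images


print(completion_times(5, 60, serialized=True))   # [60, 120, 180, 240, 300]
print(completion_times(5, 60, serialized=False))  # [300, 300, 300, 300, 300]
```

Under these assumptions the last image finishes at the same time either way, but with serialized pulls the first pod can start after 60 seconds instead of 300.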
@saulshanabrook voiced some interest in helping us figure out why launches take as long as they do so we can then improve on it.
Right now the hypothesis is that pulling the images onto a node is what takes the vast majority of time. However I don't think we have any data to back that up. So maybe the first step would be to instrument things so we can generate some data.
I wanted to open this issue to start the ball rolling. Off the top of my head I don't have a good idea for a concrete, actionable first step beyond "look into how we could instrument things".
There are several ways we could tackle the "image pulls take long" problem. Before we dive into those and get excited, we should generate some data though.