deis / registry

Docker registry for Deis Workflow.
https://deis.com
MIT License

Pushing bigger images crashes registry/minio/controller/builder #62

Closed jolly2 closed 7 years ago

jolly2 commented 8 years ago

Hello, I have an image that is about 932M in size. When I try to push it to the Deis registry, the push takes more than 15 minutes and then brings down the registry, minio, controller, and builder, and k8s tries to restart them all at the same time. This makes the system unresponsive and sometimes soft-locks the CPUs, which forces me to hard-reboot the whole cluster. Since minio has no persistent storage, I have to repeat the push from scratch.

I tried splitting the big image into smaller pieces: instead of pushing one 932M image, I pushed multiple image layers to the Deis registry (manually). Still, when pushing the last layer, the registry crashes; a minute later minio crashes, which in turn brings down the builder and controller.

There is no problem with the image itself. I have tested it directly on Docker and also through k8s, and the container builds and runs perfectly fine without Deis.

Setup: 3-node VirtualBox Deis cluster with 12-core CPU, 125G RAM, 1TB HD; Deis Workflow 2.0.0.

Thanks!

bacongobbler commented 8 years ago

Do you have logs for the registry and for minio to show how it brought down the cluster?

jolly2 commented 8 years ago

After hard-rebooting the cluster, I looked for stopped Docker containers (specifically for minio and registry) with docker ps -a. I see only the containers that came up after the machine started, not the ones that were running before the hard reboot. Is there anywhere else I can look for Deis logs?

bacongobbler commented 8 years ago

Use kubectl logs -p to see the logs from the previous instance of the container in the pod.
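As a rough illustration of that command, here is a minimal Python sketch that builds and runs it. The helper names previous_logs_command and fetch_previous_logs are hypothetical, the default namespace "deis" is an assumption about where Workflow is installed, and pod and container names have to come from your own cluster (for example from kubectl get pods). Only the --previous, --namespace, and --container flags of kubectl logs are used.

```python
import subprocess


def previous_logs_command(pod, namespace="deis", container=None):
    """Build the argv for `kubectl logs --previous` (same as `-p`).

    `namespace="deis"` is an assumption; pass the namespace you installed into.
    """
    cmd = ["kubectl", "logs", "--previous", pod, "--namespace", namespace]
    if container:
        # Only needed when the pod runs more than one container.
        cmd += ["--container", container]
    return cmd


def fetch_previous_logs(pod, namespace="deis", container=None):
    """Run kubectl and return the logs of the pod's previous container instance."""
    result = subprocess.run(
        previous_logs_command(pod, namespace, container),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if kubectl exits non-zero
    )
    return result.stdout
```

Calling fetch_previous_logs with the name of a restarted registry or minio pod should return whatever the crashed container wrote before k8s restarted it, provided the node has not been wiped in the meantime.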

jolly2 commented 8 years ago

Found the root cause of the problem. The underlying storage could not cope with the high rate of write activity from the Deis registry. This caused all the containers (including Deis's) to crash and be restarted by k8s. I solved the problem by using the underlying storage's throttling mechanism. Does Deis have anything on its roadmap for an overlay storage mechanism that handles and distributes images across different underlying storage backends and also throttles during high write activity?
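To make the idea of write throttling concrete, here is a minimal sketch of a token-bucket limiter wrapped around a chunked copy. This is not the storage-level mechanism mentioned above, whose details are not given here; it is a generic technique, and the names TokenBucket and throttled_copy, the rate, and the chunk size are all assumptions chosen for illustration.

```python
import time


class TokenBucket:
    """Allow roughly `rate` bytes per second, with bursts up to `capacity` bytes."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def consume(self, nbytes):
        """Block until `nbytes` tokens are available, then spend them."""
        if nbytes > self.capacity:
            raise ValueError("chunk larger than bucket capacity")
        while True:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            # Wait just long enough for the missing tokens to accumulate.
            self.sleep((nbytes - self.tokens) / self.rate)


def throttled_copy(src, dst, bucket, chunk_size=64 * 1024):
    """Copy src to dst in chunks, waiting on the bucket before each write."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        bucket.consume(len(chunk))
        dst.write(chunk)
        total += len(chunk)
```

With, say, TokenBucket(rate=20 * 1024 * 1024, capacity=64 * 1024), a copy through throttled_copy is held to about 20 MB/s after the initial burst, which trades a slower push for a steadier load on the disk.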

bacongobbler commented 8 years ago

That would probably be a good PR for upstream! I'm not sure if we'd support that here as we're trying to stick with official upstream registry images, but if it were PR'd upstream then we could take advantage of those mechanisms.

bacongobbler commented 7 years ago

I don't believe this is an issue any more. Closing for now, but please re-open if it still persists on v2.8.0+.