jolly2 closed this 7 years ago
Do you have logs for the registry and for minio to show how it brought down the cluster?
After hard-booting the cluster, I tried to find stopped Docker containers (specifically for minio and the registry) with docker ps -a. I see only the containers that came up after the machines restarted, not the ones that were running before the hard boot. Is there anywhere else I can look for Deis logs?
Use kubectl logs -p to see the logs from the previous instance of the container in the pod.
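As a rough sketch of that suggestion, the helper below loops over the pods matching a label selector and prints the logs of each pod's previous container instance. The function name `previous_logs` is made up for illustration, and the `deis` namespace and `app=deis-registry` / `app=deis-minio` selectors in the usage comment are assumptions about how the cluster is labeled; adjust them to whatever `kubectl get pods --show-labels` reports.

```shell
# Print logs from the previous (crashed or restarted) container instance of
# every pod matching a label selector. Namespace and selector are assumptions.
previous_logs() {
  ns="$1"
  selector="$2"
  # "-o name" prints entries such as "pod/deis-registry-abc12".
  for pod in $(kubectl get pods -n "$ns" -l "$selector" -o name); do
    name="${pod#*/}"   # strip the "pod/" prefix
    echo "== $name (previous instance) =="
    # -p / --previous selects the prior container instance. It fails when the
    # container has never restarted, so report that and keep going.
    kubectl logs -p -n "$ns" "$name" || echo "no previous instance for $name"
  done
}

# Example usage (hypothetical namespace and labels):
#   previous_logs deis app=deis-registry
#   previous_logs deis app=deis-minio
```

Note that previous-instance logs are only available while the kubelet still holds the old container, so they may be gone after a hard reboot of the node.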
Found the root cause of the problem. The underlying storage could not cope with the high rate of write activity from the Deis registry. This was crashing all the containers (including Deis's), which k8s then restarted. I used the underlying storage's throttling mechanism to solve the problem. Does Deis have a roadmap for providing any kind of overlay storage mechanism to handle and distribute images, one that also throttles high write activity to different underlying storage backends?
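The comment above does not say which throttling mechanism was used. Since the cluster in this issue runs on VirtualBox, one possibility is VirtualBox's disk bandwidth groups, sketched below. The function name `throttle_vm_disk`, the group name `DiskLimit`, the controller name `SATA`, and the port/device numbers are all hypothetical and need to match the actual VM configuration.

```shell
# Cap disk bandwidth for one VirtualBox VM by attaching its disk to a
# bandwidth group. All names and values here are illustrative assumptions.
throttle_vm_disk() {
  vm="$1"       # VM name, e.g. "deis-node-1" (hypothetical)
  limit="$2"    # bandwidth cap, e.g. "20M" for 20 MB/s
  medium="$3"   # path to the VM's disk image
  # Create a disk bandwidth group with the given cap.
  VBoxManage bandwidthctl "$vm" add DiskLimit --type disk --limit "$limit"
  # Attach the disk to that group; controller, port, and device are assumptions.
  VBoxManage storageattach "$vm" --storagectl "SATA" --port 0 --device 0 \
    --type hdd --medium "$medium" --bandwidthgroup DiskLimit
}

# Example usage (hypothetical VM name and disk path):
#   throttle_vm_disk deis-node-1 20M /path/to/node1.vdi
```

The cap of an existing group can later be adjusted with `VBoxManage bandwidthctl <vm> set DiskLimit --limit <value>`.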
That would probably be a good PR for upstream! I'm not sure if we'd support that here as we're trying to stick with official upstream registry images, but if it were PR'd upstream then we could take advantage of those mechanisms.
I don't believe this is an issue any more. Closing for now, but please re-open if it still persists on v2.8.0+.
Hello, I have an image that is about 932M in size. When I try to push it to the Deis registry, it takes more than 15 minutes and then brings down the registry, minio, controller, and builder, and k8s tries to restart them all at the same time. This makes the system unresponsive and sometimes soft-locks the CPUs, which forces me to hard-boot the whole cluster. As minio does not have persistent storage, I have to repeat the push all over again. I tried splitting my big image into smaller ones: instead of pushing one big 932M image, I pushed multiple image layers to the Deis registry manually. Still, when pushing the last image layer, the registry crashes, and a minute later minio crashes, which in turn brings down the builder and controller. There is no problem with the image itself, as I have tested it directly on Docker and also through k8s. The container builds and runs perfectly fine without Deis.
Using a 3-node VirtualBox Deis cluster with a 12-core CPU, 125G RAM, and a 1TB HD, running Deis Workflow 2.0.0.
Thanks!