Closed itay closed 5 years ago
https://github.com/gravitational/planet/pull/433 should address this issue.
It seems the installer can race the startup of the registry, and sometimes the registry doesn't start immediately. On my system this happened when docker took a long time to start. As a workaround, I removed the registry's dependency on docker starting, although this should also be addressed in the installer itself.
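For the installer side, here is a minimal sketch of what "wait for the registry before populating it" could look like. This is not the gravity or planet implementation; the function names, URL handling and timeouts are assumptions made for illustration. The only external fact it relies on is that a Docker Registry HTTP API V2 endpoint answers `GET /v2/` with 200 (or 401 when authentication is required) once it is up. TLS and client certificate setup are left out, so against a registry that requires them the probe would simply report "not ready".

```python
import time
import urllib.error
import urllib.request


def registry_is_up(base_url, timeout=2.0):
    """Return True if GET <base_url>/v2/ answers with 200 or 401."""
    try:
        url = base_url.rstrip("/") + "/v2/"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        # 401 means the registry is running but wants authentication.
        return err.code == 401
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure: treat as not ready yet.
        return False


def wait_until_ready(probe, timeout=300.0, interval=1.0, max_interval=15.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() with exponential backoff until it returns True.

    Returns True as soon as the probe succeeds, False once `timeout`
    seconds have elapsed. `sleep` and `clock` are injectable for testing.
    """
    deadline = clock() + timeout
    delay = interval
    while True:
        if probe():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_interval)
```

A caller would gate the populate step on something like `wait_until_ready(lambda: registry_is_up(url))` (with `url` being whatever address the registry listens on) and fail with a clear error if it returns False, rather than failing on the first refused connection.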
That’s great. Do you think it’ll be ported to the 5.5.x line as well?
Yea, since it looks like it's happening on older versions, I'll backport it.
Any updates on this? We run into this pretty deterministically on larger instances with more CPUs.
The workaround has been backported to the active maintenance branches, so it's just a matter of checking in with the team. I can try and push out the release tomorrow if there's nothing else pending.
Thanks - it seems like that's the only commit pending in the 5.5.x branch. If there's a published release of Planet somewhere with it, I am happy to give it a shot and see if it fixes the problem prior to cutting the release.
No, I don't think planet has been tagged yet, and I'm pretty sure the workaround is working on the 6.0 branch. It is possible to just build planet yourself and include it in an application using the custom base image feature in gravity: https://gravitational.com/gravity/docs/pack/#user-defined-base-image
@itay sorry for the delay, I got kind of swamped yesterday. I've released 5.6.5 / 5.5.13, which have the workaround for this issue.
Excellent - we will test it. Thank you so much!
Describe the bug
Periodically, gravity install fails in the populate docker registry step. Here is the sample CLI output:
This is running on an EC2 instance, stock Amazon Linux 2 AMI, with 120GB disk. Here is the cluster-config.yaml file:
I've included the log files for gravity-install, gravity-system and journalctl from within the Planet container, as well as the one from the actual system (journalctl_system.log).
To Reproduce
Nothing really - I run the exact same steps (this is more or less scripted for us), and it succeeds 19 times out of 20. Typically I can just re-run the set up (after running ./gravity leave --force to clean up), and it just works on the same machine.
Expected behavior
It should work every time.
Logs
gravity-install.log gravity-system.log journalctl.log journalctl_system.log
Environment (please complete the following information):
Additional context