Open davidsmejia opened 1 year ago
This issue is currently happening on staging, though it now errors out around the apt update step.
After inspecting the logs, I concluded that the issue was likely caused by a transient network error (Network is unreachable) and/or deb package mirror errors (Connection refused, Service Unavailable). This left the package installation incomplete, so awscli and certbot were unavailable.
I ran the following commands to return the box to a usable state:
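If the failures really are transient, one way to make the provisioning step more tolerant is to wrap the network-dependent commands in a retry loop. The sketch below is only an illustration: the `retry` helper, its attempt count, and its delays are hypothetical and are not part of the existing user-data script.

```shell
#!/usr/bin/env bash
# retry: run a command up to MAX_ATTEMPTS times, doubling the delay after each failure.
# Hypothetical helper for illustration; it does not exist in the current user-data script.
# Usage: retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: '$*' failed after ${attempt} attempts" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example (commented out so that sourcing this file has no side effects):
# retry 5 10 apt-get update
```

A retry only helps with short-lived outages. If a mirror stays down for the whole window, the script still fails, but it now fails with an explicit message rather than continuing with packages missing.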
rm /var/log/cloud-init.log \
&& rm -rf /var/lib/cloud/* \
&& cloud-init -d init \
&& cloud-init -d modules --mode final
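After re-running cloud-init, it may be worth confirming that the tools reported missing are actually back. The following is a minimal sketch; `missing_commands` is a hypothetical helper name, and it assumes awscli is exposed on PATH as `aws`.

```shell
#!/usr/bin/env bash
# missing_commands: print the name of each given command that is not on PATH.
# Returns non-zero if at least one command is missing.
# Hypothetical helper for illustration only.
missing_commands() {
  local status=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}

# Example (commented out so that sourcing this file has no side effects):
# missing_commands aws certbot || echo "provisioning incomplete" >&2
# cloud-init status --long   # shows whether cloud-init finished or hit an error
```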
To get this closed, we need to confirm that the deploy process works. That requires resolving the 1password issue first.
Context
Recently ran a deploy to production to verify whether cron jobs were being correctly initiated on deploy, and confirmed they are not. This is most likely due to the api-server-instance-user-data.tpl.sh script not completing or erroring out at some point during execution. Additionally, the api docker image did not start on deploy. To fix this manually:
sudo docker remove resources_portal_api
sudo ./start_api_with_migrations.sh
api-server-instance-user-data.sh
Solution or next step
Determine whether the api-server-instance-user-data script is indeed erroring out, and where.
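One possible starting point for locating the failure is to scan the cloud-init output log for the error strings already seen on staging. This is a sketch under assumptions: `scan_user_data_log` is a hypothetical helper, the default path is the standard cloud-init output log location, and the pattern list covers only the three messages quoted in this thread.

```shell
#!/usr/bin/env bash
# scan_user_data_log: print line-numbered matches of known failure messages.
# Defaults to the standard cloud-init output log; pass another path to override.
# Returns non-zero (grep's exit status) when nothing matches.
# Hypothetical helper for illustration only.
scan_user_data_log() {
  local log_file="${1:-/var/log/cloud-init-output.log}"
  grep -n -E 'Network is unreachable|Connection refused|Service Unavailable' "$log_file"
}

# Example (commented out so that sourcing this file has no side effects):
# scan_user_data_log | head -n 5
```

The first reported line number gives a rough indication of which step in the user-data script ran last before things went wrong. Failures that do not produce one of these messages will not show up and would need additional patterns.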