AlexsLemonade / resources-portal

https://resources.alexslemonade.org
BSD 3-Clause "New" or "Revised" License

Deploy Not Completing #986

Open davidsmejia opened 1 year ago

davidsmejia commented 1 year ago

Context

Recently ran a deploy to production to verify whether cron jobs were being correctly initiated on deploy, and confirmed they are not. This is most likely because the api-server-instance-user-data.tpl.sh script is not completing, or is erroring out at some point during execution. Additionally, the API Docker image did not start on deploy.

In order to manually fix:

Solution or next step

davidsmejia commented 1 year ago

This issue is currently happening on staging, though now it is erroring out around the apt update step.
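
One way to narrow down where the user-data script stops is to scan the cloud-init output log for network-related failures. The following is a minimal sketch, not something taken from this repository: the helper name scan_deploy_log and the list of patterns are hypothetical, and the default path assumes cloud-init writes its output to /var/log/cloud-init-output.log on the instance.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print the numbered lines
# of a log file that match common network/mirror failure messages.
# The default path is an assumption about where cloud-init output is logged.
scan_deploy_log() {
  local log_file="${1:-/var/log/cloud-init-output.log}"
  # -n: show line numbers, -i: ignore case, -E: extended regex.
  # grep exits non-zero when nothing matches, so the function does too.
  grep -n -i -E \
    'network is unreachable|connection refused|service unavailable|failed to fetch' \
    "$log_file"
}
```

Running scan_deploy_log with no argument on the instance prints any matching lines with their line numbers; a non-zero exit status means none of the patterns were found.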

arkid15r commented 1 year ago

After inspecting the logs, I concluded that the issue could be caused by a transient network error (Network is unreachable) and/or deb package mirror errors (Connection refused, Service Unavailable). This resulted in an incomplete package installation, leaving awscli and certbot unavailable.
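
If transient errors are indeed the cause, one possible mitigation is to retry the package steps in the user-data script. This is only a sketch under that assumption: the retry helper is hypothetical and not part of api-server-instance-user-data.tpl.sh, and the attempt counts and delays in the commented usage are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): run a command, retrying it
# up to MAX_ATTEMPTS times with a fixed delay (in seconds) between attempts.
# Usage: retry MAX_ATTEMPTS DELAY command [args...]
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: '$*' failed after $attempt attempts" >&2
      return 1
    fi
    echo "retry: attempt $attempt of '$*' failed; sleeping ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Assumed usage inside a user-data script (values are illustrative only):
# retry 5 10 apt-get update
# retry 5 10 apt-get install -y awscli certbot
```

Note that apt-get update can exit 0 even when some index downloads fail, so retrying the install step is likely the more reliable guard of the two.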

I ran the following commands to return the box to a usable state:

rm /var/log/cloud-init.log \
&& rm -rf /var/lib/cloud/* \
&& cloud-init -d init \
&& cloud-init -d modules --mode final

arkid15r commented 1 year ago

To get this closed, we need to make sure the deploy process works. To do that, the 1Password issue needs to be resolved first.