appsembler / configuration

a simple, but flexible, way for anyone to stand up an instance of the edX platform that is fully configured and ready-to-go
GNU Affero General Public License v3.0

tahoe: wait for nginx to start during rolling deploy #310

Closed thraxil closed 4 years ago

thraxil commented 4 years ago

On the last deploy we found that, because of gcsfuse slowness, nginx can fail to start on the first attempt. When that happens, Ansible moves on to the next server in the list and stops nginx there, so we effectively have no servers running nginx and the site is down until the first one finally gets started back up.

This adds three retries to the nginx start, waiting up to a minute between attempts. If all three fail, the whole playbook should stop, so we at least don't end up in the bad situation with no servers available.
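
For readers who have not seen the diff, here is a rough sketch of how a retry like this can be expressed in Ansible. It is not the actual change in this PR: the host group `tahoe_app_servers`, the task names, and the `nginx_start` variable are made up for illustration. It also assumes the wait is done with a fixed `delay` between attempts, and that `any_errors_fatal` is what stops the play.

```yaml
# Illustrative sketch only; group, task, and variable names are assumptions.
- hosts: tahoe_app_servers     # hypothetical inventory group
  become: true
  serial: 1                    # rolling deploy: one server at a time
  any_errors_fatal: true       # a failed host aborts the play instead of moving on
  tasks:
    - name: stop nginx
      service:
        name: nginx
        state: stopped

    # ... deploy steps for this server would go here ...

    - name: start nginx, retrying while gcsfuse catches up
      service:
        name: nginx
        state: started
      register: nginx_start
      until: nginx_start is succeeded   # retry until the start succeeds
      retries: 3                        # give up after three retries
      delay: 60                         # seconds to wait between attempts
```

The point is that the start task has to fail hard once the retries are exhausted. With `serial: 1`, a host that never gets nginx running then ends the run before nginx is stopped on the next server.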