Open jobara opened 1 year ago
This ticket captures the following tickets as related sub-tickets: https://github.com/accessibility-exchange/platform/issues/1728 https://github.com/accessibility-exchange/platform/issues/1686 https://github.com/accessibility-exchange/platform/issues/1550
@jobara - We understand that this is not a priority at this time. Is that right? The sense of our team is that we can turn off the rolling updates, but then we will have downtimes for each deployment. This might not be worth our time.
Do you agree?
@colleenskemp I'll have to think some more on this. I'll check in with @michelled when she's back.
At the dev check in meeting with @JureUrsic, @peterhebert, and @michelled we discussed using Laravel's maintenance mode for this. When the deploy is happening the script would call `php artisan down`, after the deploy is finished it would call `php artisan up`. Any users accessing the site during the maintenance time would see a maintenance page.
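A minimal sketch of that idea, assuming the deploy steps can be wrapped by a shell function. The names `deploy_with_maintenance` and `ARTISAN` are hypothetical and are not taken from the actual deploy script; only `php artisan down` and `php artisan up` come from the discussion above.

```shell
#!/bin/sh
# Sketch only: ARTISAN and deploy_with_maintenance are hypothetical names.
# ARTISAN is left unquoted on purpose so "php artisan" splits into two words.
ARTISAN="${ARTISAN:-php artisan}"

deploy_with_maintenance() {
    # Enter maintenance mode; abort if that fails so the deploy never runs "live".
    $ARTISAN down || return 1

    # Run the deploy steps passed as arguments and remember their exit status.
    if "$@"; then
        status=0
    else
        status=$?
    fi

    # Leave maintenance mode whether or not the deploy steps succeeded.
    $ARTISAN up || return 1

    return "$status"
}
```

It could be called as `deploy_with_maintenance php artisan migrate --force`, for example. Bringing the site back up after a failed step is a design choice in this sketch; it may be safer to stay in maintenance mode if a migration fails partway through.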
@JureUrsic I was thinking about this today and wondering when and where it should run. I was thinking it could go around the migration step in `DeployGlobal.php`, but I'm not sure: wouldn't the old web head need to come down before we take the site out of maintenance mode? Also, are you able to take on work on this task?
@jobara it should go into "local" command on start and beginning
I can run some tests on dev, just give me the commands to run
@JureUrsic thanks, you can use the `php artisan down` and `php artisan up` commands. See Laravel's maintenance mode for more information.
@JureUrsic the other day I manually reset the database in the dev deploy. As part of that I put the site in maintenance mode. After bringing the site back up using `php artisan up`, the site was removed from maintenance mode, but for several minutes it remained inaccessible and returned a 500 error, from nginx I believe. So the site actually looked broken for a while. I'm not sure if this will happen with the plans we have for this ticket, but it is something to look into along with it.
@marvinroman
So the problem with maintenance mode currently is that the health check on the pods also gets maintenance mode, so the pod is considered unhealthy and the load balancer doesn't forward connections.
We will take the following actions to fix:
- [ ] Create a health check that will bypass maintenance mode.
- [ ] Put the `php artisan down/up` in the `php artisan deploy:global` command.

@jobara I've made the necessary changes in the branch associated with this issue. Let me know if you want me to create a PR for it.
@marvinroman thanks for working on this. Yes, please file a PR for the changes.
> So the problem with maintenance mode currently is that the health check on the pods also gets maintenance mode so the pod is considered unhealthy and the load balancer doesn't forward connections.
> We will take the following actions to fix:
> - [ ] Create a health check that will bypass maintenance mode.
> - [ ] Put the `php artisan down/up` in the `php artisan deploy:global` command.
Regarding the health check, in taking a glance at your branch, it looks like it checks the DB now. But I guess that won't really tell us if the web site is actually served up properly. Is there a way to check different things if the site is in maintenance mode or not?
Regarding turning maintenance mode on/off in the global deploy, will that affect the original instance as well and not just the two new ones that are in the process of spinning up?
@marvinroman also in your branch I noticed that it brings the site back up after 5 minutes. These kinds of timers are always risky as we don't know if the task has yet to complete or completed some time before. Is it possible to get a hook into when the pods are actually being used, and/or when the old pods are all removed?
This is a health check of the pod, not of the site. It tells the load balancer whether to forward connections to the pod; in other words, it checks whether the services are running properly. We have an external check that determines site health and will notify us of site issues.
When maintenance mode is activated it occurs across all the pods.
I agree that there are risks associated with a timer, but we haven't found an alternative at this time.
We have determined that lifecycle hooks aren't possible to use in our infrastructure at this time.
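For reference, the usual contrast to a fixed timer is polling a readiness check with an upper time limit. The sketch below only illustrates that general pattern; `wait_until_ready` is a hypothetical helper, it is not part of the branch, and the thread does not establish that a suitable check exists in this infrastructure.

```shell
#!/bin/sh
# Sketch only: wait_until_ready is a hypothetical helper.
# Usage: wait_until_ready <timeout_seconds> <interval_seconds> <check command...>
wait_until_ready() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0

    # Re-run the check until it succeeds or the time budget is spent.
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}
```

A call might look like `wait_until_ready 300 5 curl -fsS "$HEALTH_URL" && php artisan up`, where `HEALTH_URL` is an assumed variable pointing at whatever endpoint signals readiness. This would still not show that the old pods have been removed, only that the chosen check passes.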
Prerequisites
Describe the bug
In our current rolling deploy system, as new pods are being deployed an old pod sticks around until the new ones are ready for use. However, there is a single shared database that the pods connect to. The issue here is that a user may be interacting with the old pod, but the database could have been migrated to a new structure. This could lead to data corruption and/or 500 errors reported to the user as the application may have a mismatch of expectations of the data compared to the current database.
Expected behavior
We should minimize or eliminate the possibility of the old application and the new database interacting with each other.