Open lukehutch opened 3 months ago
I did not experience this when I set up and tested the Terraform scripts. Are you using GCP? Did you make any modifications to the scripts that could affect this?
I am using GCP, and I have neither made changes to the terraform scripts, nor manually configured anything.
This is what these errors look like, for the record:
statusCode = 502, ServerpodClientException: Unknown error, data:
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
They can last 5 or 10 minutes after deployment.
@vlidholt I really need a zero-downtime way to deploy server updates. Currently the server is unavailable for 5 to 15 minutes after a deploy from GitHub, and occasionally for much longer than that. Is there anything that can be tweaked in the Terraform scripts to reduce or eliminate downtime?
@vlidholt two hours after the last server deploy, I am left without any VM instances at all!
This is a very serious problem... how do I get to the bottom of what went wrong?
When I run the deploy workflow from GitHub, I get 502 server errors for a few minutes when trying to connect to the server from my app.
The deploy workflow is supposed to leave the old server running until the new server has finished starting, so this shouldn't happen. (There's a Terraform setting for this. I can't find it right now, but I saw earlier that the option was set...)
This means I have to be very strategic about when I restart my server (based on the time when the fewest users are online), even just to update website content :-( This is not a good situation.
Also, I wonder if this will affect the autoscaler...
Is there a policy change that can be made to minimize or eliminate downtime?
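For reference, in the Google Terraform provider this kind of rolling-update behaviour is normally controlled by the `update_policy` block on the managed instance group, together with a health check. Below is a minimal sketch of what such a policy could look like. It is not taken from the Serverpod scripts: the resource names, region, port, health-check path, and delay values are all assumptions for illustration, and it assumes a regional managed instance group and an instance template defined elsewhere as `google_compute_instance_template.api`.

```hcl
# Hypothetical health check; port 8080 and path "/" are assumptions.
resource "google_compute_health_check" "api" {
  name = "example-api-health"

  http_health_check {
    port         = 8080
    request_path = "/"
  }
}

# Hypothetical regional managed instance group. The instance template
# (google_compute_instance_template.api) is assumed to exist elsewhere.
resource "google_compute_region_instance_group_manager" "api" {
  name               = "example-api-group"
  base_instance_name = "example-api"
  region             = "us-central1"

  version {
    instance_template = google_compute_instance_template.api.id
  }

  update_policy {
    type           = "PROACTIVE"
    minimal_action = "REPLACE"

    # Create new instances before any old ones are removed.
    # For a regional group, a non-zero surge must be at least
    # the number of zones the group spans (3 is assumed here).
    max_surge_fixed       = 3
    max_unavailable_fixed = 0
  }

  auto_healing_policies {
    health_check = google_compute_health_check.api.id

    # Grace period for the server to boot before health checks count
    # against it; the value is a guess and should match real startup time.
    initial_delay_sec = 300
  }
}
```

With `max_unavailable_fixed = 0`, the group should keep the old instances serving until the replacements pass the health check, so if 502s still appear, the health check settings and the load balancer backend configuration would be the next things to compare against this sketch.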