Open tommydongaws opened 1 year ago
"Min size" refers to the minimum number of instances that App Runner provisions for your service. The service always has at least this number of provisioned instances. By default, this value is 1 and can be increased by configuring MinSize setting. So the number of instances don't go 0. The only case when they go to 0 is when you pause a service and when you resume, they come back to minimum number of instances in the service. Beyond minimum size, the service scales based on traffic upto MaxSize configured on your service. When there is no traffic, the ActiveInstances metric shows 0 since there are no active instances which are processing requests but there will always be minimum number of instances ready to serve the traffic when it comes. Please read more details here - https://docs.aws.amazon.com/apprunner/latest/dg/manage-autoscaling.html
Yes, that is what I said: "MinSize" is just for minimum provisioned instances, not active instances.
From what I understand it goes like this: There is traffic -> ActiveInstances is 0 -> Provisioned instance gets unthrottled -> ActiveInstances goes back to 1
I'm wondering if it is possible to leave one instance to stay active so that there won't be a delay.
I'm curious about this behavior too. We've observed occasional "cold start"-like behavior where requests take 10-20 seconds or time out with a 503, and the App Runner metrics show 0 instances running at that time. Typically the service then starts to respond after a minute or so.
@rsharrott Could you please share service ARN for your App Runner service and timestamps when you saw high latencies/503s for us to take a look further? Thanks.
@amitgupta85 We also received downtime notifications from our monitoring tool when App Runner scaled down to 0 instances. Because of this, we have moved a few of our critical services to Fargate. I will share the ARN after consulting my team.
I think this really depends on the type of application. I noticed that, according to the docs, unthrottling should take fractions of a second. Even so, it can still indirectly cause a long delay.
For example:
In that scenario there would be a ~2 second delay because of the throttle. Moreover, it would be a consistent 2 second delay if the site only gets traffic once in a while.
As applications get much bigger, I'm sure there would be more issues caused by being throttled. It would be more convenient to have a "minimum available instances" setting.
@amitgupta85 I saw this issue today on my AppRunner instance when deploying an update.
For a few moments, my production app was throwing 503 errors.
My ARN is arn:aws:apprunner:us-east-1:721523162075:service/Venrollment_Prod_v2/21955eb3f78442ba9d0d2d0f9e22a6c1
Tell us about your request What do you want us to build? Currently there is no minimum available instances setting, since the current "MinSize" is just for minimum provisioned instances. This could be a blocker for production services, since there is a possibility that there are 0 guaranteed available instances. Customers could potentially get a 5xx error if the service scaled to 0. I would like there to be a "minimum available instances" setting so that there is a way to prevent it from scaling to 0.
Describe alternatives you've considered The current workaround seems to be just using a script to curl the server every few minutes, but this is not ideal, and users could get confused by "MinSize".
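The curl workaround mentioned above could look roughly like the following sketch, written with Python's standard library instead of curl. The URL, the five-minute interval, and the function names are hypothetical, and nothing here guarantees that an instance stays active; it only sends periodic requests so that the service sees some traffic.

```python
import time
import urllib.error
import urllib.request


def ping(url, timeout=10, opener=urllib.request.urlopen):
    """Send one GET request; return the HTTP status, or None on a network error."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def keep_warm(url, interval_seconds=300, iterations=12,
              sleep=time.sleep, opener=urllib.request.urlopen):
    """Ping the URL `iterations` times, sleeping between pings; return the statuses."""
    statuses = []
    for attempt in range(iterations):
        statuses.append(ping(url, opener=opener))
        if attempt < iterations - 1:
            sleep(interval_seconds)
    return statuses


if __name__ == "__main__":
    # Hypothetical service URL; replace it with your own App Runner endpoint.
    print(keep_warm("https://example.invalid/health", iterations=1))
```

The `sleep` and `opener` parameters exist only so the loop can be exercised without real network calls or waiting; a scheduled job that calls `ping` once per run would work just as well.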