aws / apprunner-roadmap

This is the public roadmap for AWS App Runner.
https://aws.amazon.com/apprunner/

CORS related issues happening on random #220

Open st3fus opened 7 months ago

st3fus commented 7 months ago

Hello, is there a way to disable App Runner scaling to zero? It's causing issues in production: people often don't use the application for a while, and the backend running on AWS App Runner scales down to 0. When a user opens the app after that, they get a bunch of CORS errors until it scales back up to 1. It's pretty annoying, and I'd like to have an option to turn this off.

Thanks

hariohmprasath commented 7 months ago

Hi @st3fus, in App Runner we don't scale the service down to zero when we don't receive any traffic. Scaling to zero is one of the requested features on our roadmap (#9) and is currently not implemented. Can you add some logging to your CORS filter to understand where the request is failing: in the CORS layer or in the application layer?
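As a rough illustration of that kind of logging, here is a minimal sketch of a WSGI wrapper that records the request's Origin header next to the status and the `Access-Control-Allow-Origin` header actually sent back. The names `CorsLoggingMiddleware` and `cors_debug` are made up for this example, and it assumes the backend is served through WSGI (as a Django app typically is); it is not the CORS filter used in the original application.

```python
import logging

logger = logging.getLogger("cors_debug")  # hypothetical logger name


class CorsLoggingMiddleware:
    """WSGI wrapper that logs the request Origin and the CORS header sent back."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        origin = environ.get("HTTP_ORIGIN")
        method = environ.get("REQUEST_METHOD")
        path = environ.get("PATH_INFO")

        def logging_start_response(status, headers, exc_info=None):
            # Header names are case-insensitive, so compare in lower case.
            allow_origin = next(
                (v for k, v in headers if k.lower() == "access-control-allow-origin"),
                None,
            )
            logger.info(
                "%s %s origin=%s status=%s allow_origin=%s",
                method, path, origin, status, allow_origin,
            )
            return start_response(status, headers, exc_info)

        return self.app(environ, logging_start_response)
```

In a Django project this could wrap the object returned by `get_wsgi_application()` in `wsgi.py`. If a failing browser request never produces a log line, the request most likely did not reach the application at all.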

epomatti commented 7 months ago

@st3fus let me share my own experience with the service and see if this helps.

Based on a conference video I watched a while ago and some documentation, after some period without traffic App Runner puts the vCores in an idle state but keeps the container(s) warmed up in memory. Once an HTTP request hits an endpoint in your App Runner service, the container is put back to work almost instantaneously; there is no cold start. We've been using it for a while with no issues in that regard.

The benefit here of course is cost reduction, since you'll pay only for the memory you allocate, but pay for vCores only for the amount you actually use.

The issue is that your background processes will stop when the application is in an idle state. Assuming that App Runner is working correctly, it is possible that background processes in your application are failing due to this App Runner behavior. Try troubleshooting this. Or, if you need background processes, you should consider a different host option.

This idle state is poorly documented in my opinion, I couldn't find it anywhere, and it should be highlighted.

Last point is, I don't know why there is no option to disable the idle behavior for App Runner. In my experience almost all solutions have some sort of background processes that eventually get implemented.

chai3 commented 7 months ago

I think background processes in an idle state never stop; they are throttled. I measured a CPU-throttled instance in the idle state at about 0.01 vCPU.

https://aws.amazon.com/apprunner/faqs/

If your application receives no incoming requests, App Runner will scale the containers down to a provisioned instance, a CPU-throttled instance ready to serve incoming requests within milliseconds.
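For anyone who wants to check the throttling themselves, one possible approach (an assumption on my part, not necessarily how the 0.01 vCPU figure above was obtained) is to run a busy loop for a short wall-clock window and compare the CPU time the process actually received with the time that elapsed. The function name `measure_cpu_share` is made up for this sketch.

```python
import time


def measure_cpu_share(busy_seconds: float = 0.2) -> float:
    """Spin for `busy_seconds` of wall-clock time and return CPU time / wall time.

    On an unthrottled core the ratio is close to 1.0. When the container's
    CPU is throttled, the process receives less CPU time than wall time
    and the ratio drops accordingly.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    while time.perf_counter() - wall_start < busy_seconds:
        pass  # busy loop: ask for as much CPU as the scheduler will give
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu / wall


if __name__ == "__main__":
    print(f"approx. CPU share: {measure_cpu_share():.2f} vCPU")
```

Calling this from a background thread while the service receives no traffic, and logging the result, should show whether background work is stopped or only slowed down.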

st3fus commented 6 months ago

I'm still having issues with CORS errors occurring randomly. They mostly happen after the app hasn't been active, but they also happen while it is active. Usually a few refreshes fix everything, but still, I can't have that in production. Here are some screenshots from when it happens. Only the Python/Django backend is running on App Runner; my web server with nginx and the frontend/admin panel is on a separate EC2 instance.

[Screenshots not preserved]

P.S. Nothing shows up in Backend (app runner) logs or nginx logs related to this.

st3fus commented 6 months ago

It's definitely due to this stupid feature of yours where it throttles down to 0.01 vCPU. Every time the requests hit 503 and 404, the instance is turning back up and I can see the initial launch of the backend again...

epomatti commented 6 months ago

@st3fus not sure if this helps, but I had "fake" CORS errors show up before in other projects and they were network issues. I'll share this blog post I found but I'm sure there's more.
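One way to tell a "fake" CORS error from a real one is to repeat the failing request outside the browser and look at the raw response. A browser reports a CORS failure whenever a cross-origin response lacks a matching `Access-Control-Allow-Origin` header, which includes error responses (a 503, for example) produced by something in front of the application. The sketch below assumes a plain GET endpoint; the function name `probe` and the URL and origin in the usage note are placeholders.

```python
import urllib.error
import urllib.request


def probe(url: str, origin: str, timeout: float = 10.0):
    """Return (status, Access-Control-Allow-Origin header or None) for one GET."""
    request = urllib.request.Request(url, headers={"Origin": origin})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("Access-Control-Allow-Origin")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry headers worth inspecting.
        return err.code, err.headers.get("Access-Control-Allow-Origin")
```

For example, `probe("https://api.example.com/health/", "https://app.example.com")` returning a 5xx status with `None` for the header would point at an upstream or network failure rather than the CORS configuration. Connection-level failures are not caught here and surface as `urllib.error.URLError`.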

To apply a workaround immediately, I would use a health probe or Lambda to keep App Runner from going into an idle state. Probably Route 53 health checks are the easiest option. Assuming this works, you can focus on the real issue.
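If you go the Lambda route, a minimal sketch could look like the following. The health URL and the `/health/` path are hypothetical, the function is assumed to be invoked on a schedule (an EventBridge rule, for instance), and whether periodic pings actually keep the instance out of the idle state is something to verify rather than take for granted.

```python
import urllib.request

# Hypothetical health endpoint; replace with the service's real URL.
HEALTH_URL = "https://example.awsapprunner.com/health/"


def ping(url: str = HEALTH_URL, timeout: float = 5.0) -> int:
    """Send one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def handler(event, context):
    """Lambda entry point, intended to be invoked on a schedule."""
    return {"status": ping()}
```

Non-2xx responses raise `urllib.error.HTTPError`, so a failing ping shows up as a failed invocation in the Lambda metrics.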

Next, I would isolate the problem and try to confirm if the idle mechanism might be triggering any abnormal behavior in your application. Just one example to illustrate my point: during the time the application is idle, database connections could be expiring in the pool, and when it comes back from the idle state, it takes a few seconds for the pool to get refreshed with healthy database connections. Maybe troubleshooting this in your test environment with debug logs will help determine if any components are being affected, thus ending up generating the CORS issues you're seeing.
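To make the connection pool example concrete for a Django backend: if persistent connections are in use, Django 4.1 and later can check a reused connection before each request touches it. This is only a sketch under that assumption; the engine and database name are placeholders, and I don't know how the database is configured in your project.

```python
# settings.py (fragment). Engine and database name are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "appdb",  # hypothetical
        # Reuse a connection for up to 60 seconds instead of reconnecting per request.
        "CONN_MAX_AGE": 60,
        # Django 4.1+: check that a reused connection still works before a request uses it.
        "CONN_HEALTH_CHECKS": True,
    }
}
```

If the errors disappear with health checks enabled, stale connections after the idle period were a likely contributor.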

To get direct technical support from Amazon, you should submit a support case; you'll need to subscribe to one of their paid support plans for technical assistance. There is currently no SLA for App Runner.

st3fus commented 6 months ago

@epomatti Thanks for the reply. I'm 99% sure that it's because of the idle state. In the meantime, while I was waiting for a response here, I created a separate EC2 instance, pointed the DNS records to its new public IP, and brought up the backend there. For the past few hours everything has been running smoothly, and it even works faster. I used an instance equivalent to my App Runner config: 2 vCPUs and 4 GB of RAM.

I also have a staging env with a similar domain. The only difference is that the backend, frontend, admin panel, and nginx are all on the same instance, with separate subdomains. That works like a charm as well.