Closed OmarIthawi closed 2 years ago
The main problem with this approach is that it kills in-flight requests. If you stop the nginx process, it does a graceful shutdown: it immediately stops accepting new requests, waits for existing ones to complete, and only then actually shuts down. Cutting things off at the firewall level just blocks packets, so anything that's currently in flight can be cut off partway through, or it can complete on the backend/Django side but be unable to return a response.
We'd also need to test what the LB actually does. When it tries to health-check a VM where nginx is stopped, it gets an immediate failure response. Dropping packets with the firewall usually looks more like a slow server, and you have to wait for the TCP packets to time out (or some other timeout that the LB might have) before you can definitively say "this backend is unhealthy and needs to be removed from the pool".
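To make the difference concrete, here is a rough sketch of a TCP health check that reports the two cases separately. This is not what the LB actually runs; the `probe` function and its 2-second default timeout are hypothetical, just to show why a stopped nginx fails fast while a firewall drop only fails once the timeout expires.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a single TCP health check attempt.

    Hypothetical helper for illustration only; a real load balancer
    has its own health-check logic and timeouts.
    """
    try:
        # A completed TCP handshake means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return "healthy"
    except ConnectionRefusedError:
        # Nothing is listening (e.g. nginx stopped): the kernel answers
        # with a reset right away, so the failure is immediate.
        return "refused"
    except socket.timeout:
        # No answer at all (e.g. packets dropped by a firewall): we only
        # find out after waiting for the full timeout.
        return "timeout"
    except OSError:
        # Anything else, such as an unreachable network.
        return "error"
```

With nginx stopped, `probe` should come back with `"refused"` almost instantly. With packets dropped at the firewall, it would typically block for the whole timeout before returning `"timeout"`, which is the "slow server" behavior described above.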
> The main problem with this approach is that it kills in-flight requests. If you stop the nginx process, it does a graceful shutdown: it immediately stops accepting new requests, waits for existing ones to complete, and only then actually shuts down. Cutting things off at the firewall level just blocks packets, so anything that's currently in flight can be cut off partway through, or it can complete on the backend/Django side but be unable to return a response.
This is the expected behavior, and yes, it's not acceptable.
I spent 30 minutes trying to block 443 from staging-tahoe-us-juniper-edxapp-1 to verify this behavior, without luck.
I'm going to stop experimenting here because this is no longer the low-hanging fruit I was hoping for.
🏳️
This is a proposal to remove nginx from the load balancer without risking breaking it.
Status
This pull request is mostly to discuss the idea. We've had a ton of proposals and discussions so far on
This is my last attempt to find low-hanging fruit to reduce the impact of GSFuse hanging during deploys.
The only other viable solution currently is: https://appsembler.atlassian.net/l/cp/JMywXNn2
Checklist