monarch-initiative / monarch-legacy

Monarch web application and API
BSD 3-Clause "New" or "Revised" License

URGENT: Load testing as part of release cycle #998

Open jmcmurry opened 9 years ago

jmcmurry commented 9 years ago

We urgently need to develop some load tests. We can all agree that the servers cannot go down, especially when we have spikes of usage during or after presentations. Who is the right person to take this on?

kltm commented 9 years ago

Load tests are important, but I think this probably needs to be unpacked into a few different items: unit test sets, a test framework for pushing the services until they overload, fuzz testing, etc. Also, is it certain that it was an overload as opposed to some other type of issue, and if so, which parts caused the load?
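To make the "push the services until they overload" item concrete, here is a minimal sketch of a ramping load test using only the Python standard library. Nothing in it comes from the Monarch codebase: `fetch_ok`, `ramp_load`, the step sizes, and the 5% error threshold are all illustrative assumptions, and a real test would need realistic URLs and traffic patterns.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_ok(url, timeout=5.0):
    """Return True if the URL answers with an HTTP status below 500."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except HTTPError as err:
        # 4xx still means the server is up; 5xx counts as a failure.
        return err.code < 500
    except (URLError, OSError):
        # Connection refused, DNS failure, timeout, etc.
        return False


def ramp_load(request_fn, steps=(1, 5, 10, 20), requests_per_worker=5,
              max_error_rate=0.05):
    """Raise concurrency step by step and stop once errors exceed the limit.

    request_fn is any zero-argument callable returning True on success.
    Returns a list of (workers, error_rate, mean_latency_seconds) tuples,
    one per step that was run.
    """
    def timed_call(_):
        start = time.perf_counter()
        ok = request_fn()
        return ok, time.perf_counter() - start

    results = []
    for workers in steps:
        total = workers * requests_per_worker
        with ThreadPoolExecutor(max_workers=workers) as pool:
            outcomes = list(pool.map(timed_call, range(total)))
        errors = sum(1 for ok, _ in outcomes if not ok)
        error_rate = errors / total
        mean_latency = sum(elapsed for _, elapsed in outcomes) / total
        results.append((workers, error_rate, mean_latency))
        if error_rate > max_error_rate:
            break  # This concurrency level is where the service gave out.
    return results
```

Pointed at a staging server, usage might look like `ramp_load(lambda: fetch_ok("https://staging.example.org/"))`, where the URL is a placeholder. The last tuple in the result gives the concurrency level at which the error rate crossed the threshold, which also helps answer which parts cause the load if it is run per endpoint.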

As well, since the desired outcome is to not have things go down, coordination with production about things like load balancing and failover options would go a long way to preventing disruptions.

mellybelly commented 9 years ago

I looked at Google Analytics and it seems there were only 70 counted users today, compared to over a hundred on most recent days. So either Google Analytics hasn't finished counting, or the outage wasn't caused by load from our presentation.

kltm commented 9 years ago

Also, given the outages, there should be an external turnkey solution for restarting the servers. As well, something that we've had in AmiGO due to similar circumstances is a dead man's switch so that traffic can be redirected to a secondary site (e.g., sending people to beta would probably be better than a 500 error).
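As a rough illustration of the dead man's switch idea (not the actual AmiGO setup, whose details are not described here), the sketch below routes traffic to a secondary site whenever the primary has not sent a heartbeat within a timeout. The class name, the URLs, and the 30-second timeout are assumptions made up for the example.

```python
import time


class DeadMansSwitch:
    """Pick the primary site while it keeps sending heartbeats.

    If no heartbeat arrives within timeout_s seconds, target() returns the
    fallback URL instead. All names and defaults here are hypothetical.
    """

    def __init__(self, primary_url, fallback_url, timeout_s=30.0,
                 clock=time.monotonic):
        self.primary_url = primary_url
        self.fallback_url = fallback_url
        self.timeout_s = timeout_s
        self._clock = clock  # Injectable so the switch can be tested.
        self._last_beat = clock()

    def heartbeat(self):
        """Called by (or on behalf of) the primary to show it is alive."""
        self._last_beat = self._clock()

    def target(self):
        """Return the URL that traffic should be sent to right now."""
        if self._clock() - self._last_beat > self.timeout_s:
            return self.fallback_url
        return self.primary_url
```

The key property is that the switch fails toward the fallback: if the primary hangs or crashes, it simply stops calling `heartbeat()`, and no action from the broken server is needed to trigger the redirect.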

Part of the current architecture is that a single app (now a Node app) is running and contacts a bunch of services. If you put a more robust server in front of it (Apache, nginx, etc.), you keep control and have fallback options even if the app goes down.
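In practice this front layer would be Apache or nginx configuration, but the behaviour can be sketched in a few lines of Python as a WSGI wrapper: forward the request to the app, and serve a static page with a 503 if the app cannot be reached. `make_front` and `upstream_fetch` are hypothetical names invented for this illustration, not part of any existing deployment.

```python
def make_front(upstream_fetch, fallback_html):
    """Build a WSGI app that fronts another service.

    upstream_fetch(path) must return (status_line, body_bytes) and may raise
    any exception if the app behind it is down. Both names are assumptions
    for this sketch.
    """
    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/")
        try:
            status, body = upstream_fetch(path)
        except Exception:
            # The app is down or unreachable: answer with the fallback page
            # rather than letting the visitor see a raw connection error.
            status = "503 Service Unavailable"
            body = fallback_html.encode("utf-8")
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]

    return app
```

The design point is the same as with a real reverse proxy: because the front layer is a separate, simpler process, it stays up when the app fails and can show a maintenance page or point visitors at a secondary site.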