Closed EddieLF closed 7 months ago
I don't believe memory is the bottleneck here - I suspect we're actually misconfiguring Django on deploy (similar to how we misdeployed seqr).
Noting the Cloud Run instance is deployed with: 1 CPU, 80 concurrent requests, 100 max instances, but in gunicorn here we're asking for 2 workers and 4 threads. I think we should switch to 3 workers (per the gunicorn docs) and 1 thread (as gcloud is not hyperthreaded) and see what happens:
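Concretely, the change might look something like this in the Dockerfile's gunicorn invocation. This is a sketch only: the actual WSGI module path, port binding, and any other flags in the repo's Dockerfile are assumptions, not taken from this project.

```shell
# Hypothetical gunicorn CMD for a 1-CPU Cloud Run instance:
# workers = (2 x 1 CPU) + 1 = 3, threads = 1 (no hyperthreading).
# "myproject.wsgi:application" is a placeholder module path.
gunicorn --bind :$PORT --workers 3 --threads 1 --timeout 0 myproject.wsgi:application
```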
@illusional thanks for pulling up the plots which show that peak memory utilisation is not even at 50% while set to 512mb. I agree with the interpretation of the gunicorn docs and that those changes to workers and threads are a good place to start. PR updated with a change to the Dockerfile and a comment linking the docs.
The currently deployed Cloud Run instance for the curation portal only has 512Mi of memory. As the number of variants in the project scales, the curators are facing long delays in loading projects, variants, charts, etc. ~Doubling the memory with the --memory flag may alleviate some of these issues.~
As suggested below by Michael, more memory may not be the solution. We should start by correcting the number of workers to `(2 x num_cores) + 1` and setting `threads=1` (as gcloud is not hyperthreaded), and see if there is a noticeable improvement.
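The worker count above follows gunicorn's documented heuristic, which for this 1-CPU instance works out as a quick calculation:

```python
def suggested_workers(num_cores: int) -> int:
    """Gunicorn's rule of thumb for worker count: (2 x cores) + 1."""
    return 2 * num_cores + 1

# The Cloud Run instance here is deployed with 1 CPU.
print(suggested_workers(1))  # 3
```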