Cadasta / cadasta-platform

[DEPRECATED] Main repository of the Cadasta platform. Technology to help communities document their land rights around the world.
https://demo.cadasta.org
GNU Affero General Public License v3.0

Update uwsgi.ini #2030

Closed: alukach closed this pull request 6 years ago

alukach commented 6 years ago

Proposed changes in this pull request

Why I made this change

The production server had run dangerously low on memory:

ubuntu@platform-prod-1:~$ free -m -h
             total       used       free     shared    buffers     cached
Mem:          2.0G       1.9G        65M       1.8M       776K        12M
-/+ buffers/cache:       1.9G        78M
Swap:           0B         0B         0B

Our uwsgi processes were the biggest offenders:

ubuntu@platform-prod-1:~$ ps -eo size,pid,user,command | sort -rn | head -10 | awk '{
> hr[1024**2]="GB"; hr[1024]="MB";
> for (x=1024**3; x>=1024; x/=1024) {
> if ($1>=x) { printf ("%-6.2f %s ", $1/x, hr[x]); break }
> } } { printf ("%-6s %-10s ", $2, $3) }
> { for ( x=4 ; x<=NF ; x++ ) { printf ("%s ",$x) } printf ("\n") }
> '
348.42 MB 3749   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
347.88 MB 3753   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
311.04 MB 3751   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
275.47 MB 3750   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
263.26 MB 3747   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
256.54 MB 3746   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
242.01 MB 3752   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
216.46 MB 963    syslog     rsyslogd
189.14 MB 3748   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
181.92 MB 3744   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini

I don't know whether this caused any user impact; I only noticed it when I was unable to start a Python shell process on the server.

I believe we were suffering from the situation described in this blog post: the memory footprints of our uwsgi processes had ballooned, and we were recycling those processes far too infrequently.

After applying this change and restarting the uwsgi service, the memory usage dropped (as you could reasonably predict):

ubuntu@platform-prod-1:~$ ps -eo size,pid,user,command | sort -rn | head -10 | awk '{
hr[1024**2]="GB"; hr[1024]="MB";
for (x=1024**3; x>=1024; x/=1024) {
if ($1>=x) { printf ("%-6.2f %s ", $1/x, hr[x]); break }
} } { printf ("%-6s %-10s ", $2, $3) }
{ for ( x=4 ; x<=NF ; x++ ) { printf ("%s ",$x) } printf ("\n") }
'
216.46 MB 963    syslog     rsyslogd
160.76 MB 5593   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
159.85 MB 5589   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
81.87  MB 1029   cadasta    /opt/cadasta/env/bin/python /opt/cadasta/cadasta-platform/cadasta/manage.py sync_tasks -v 2
81.67  MB 5591   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
80.32  MB 5592   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
79.16  MB 5586   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
77.22  MB 5585   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
77.09  MB 5584   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini
66.79  MB 5590   cadasta    /usr/local/bin/uwsgi --ini cadasta.ini

Description of the change

With the max-requests setting lowered to 100, each uwsgi process reloads itself after serving 100 web requests (rather than the previously configured 5000).

Because the Cadasta Platform is served on machines with a relatively small amount of RAM (2 GB), it's important that we're proactive about keeping memory footprints small.
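Since a deployment can silently overwrite a manually applied setting, a small check like the one below could confirm that the value is actually present in the ini file. This is only a sketch: the sample config text, the `processes` line, and the function name `read_max_requests` are assumptions for illustration; only `max-requests = 100` comes from this PR.

```python
import configparser

# Hypothetical excerpt of a uwsgi ini file. Only "max-requests = 100"
# reflects this PR; the rest is assumed for the example.
SAMPLE_INI = """
[uwsgi]
processes = 10
max-requests = 100
"""


def read_max_requests(ini_text, section="uwsgi"):
    """Return max-requests as an int, or None if it is not set."""
    parser = configparser.ConfigParser(
        interpolation=None,    # uwsgi values may contain '%' placeholders
        strict=False,          # uwsgi allows the same key to appear twice
        allow_no_value=True,   # tolerate bare keys without '='
    )
    parser.read_string(ini_text)
    value = parser.get(section, "max-requests", fallback=None)
    return int(value) if value is not None else None


if __name__ == "__main__":
    print(read_max_requests(SAMPLE_INI))
```

Running the same function against the deployed file's contents after each release would show whether the setting survived the deployment.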

How someone else can test the change

This is hard to test because it relates to the amount of data loaded while handling web requests. I'm not sure which of our views loads the most data into memory; possibly the /async/locations endpoint. I recommend that we watch the production server for a week or so to ensure that memory usage stays reasonably low. After restarting uwsgi, the server was using only about 730 MB:

ubuntu@platform-prod-1:~$ free -m -h
             total       used       free     shared    buffers     cached
Mem:          2.0G       731M       1.2G       1.7M        27M       183M
-/+ buffers/cache:       520M       1.4G
Swap:           0B         0B         0B

When should this PR be merged

Once we are confident that the memory shortage does not return, or before the next release is deployed (as deploying that release will undo the changes I applied manually).

Risks

The downside of a low max-requests value is that our workers may reload themselves too often, interrupting service for end users. Because we serve few requests (fewer than 1 per second) and have a pool of 10 uwsgi processes, I don't think this will be an issue.
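To put rough numbers on this risk, here is a back-of-the-envelope sketch in Python. It assumes requests are spread evenly across the workers, which uwsgi does not guarantee, and it treats 1 request per second as an upper bound based on the estimate above. The function name is made up for illustration.

```python
def reload_intervals(requests_per_second, workers, max_requests):
    """Estimate reload timing under an even-distribution assumption.

    Returns (seconds between reloads of one worker,
             seconds between reloads anywhere in the pool).
    """
    # Each worker sees requests_per_second / workers, so it needs
    # max_requests * workers / requests_per_second seconds to hit the limit.
    worker_seconds = max_requests * workers / requests_per_second
    # With all workers cycling at that pace, the pool as a whole sees
    # one reload every max_requests / requests_per_second seconds.
    pool_seconds = max_requests / requests_per_second
    return worker_seconds, pool_seconds


if __name__ == "__main__":
    for limit in (5000, 100):
        per_worker, per_pool = reload_intervals(1.0, 10, limit)
        print(limit, per_worker, per_pool)
```

At 1 request per second with 10 workers, a limit of 100 means each worker reloads roughly every 1000 seconds (about 17 minutes), and some worker in the pool reloads about every 100 seconds. With the old limit of 5000, each worker would have lived for roughly 50000 seconds (about 14 hours). Since real traffic is below 1 request per second, actual reloads should be less frequent than these figures.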

Follow-up actions

Watch memory usage on server.
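One possible way to automate this follow-up is to read /proc/meminfo periodically and compute the same figure that the "-/+ buffers/cache" row of the free output reports as used (total minus free, buffers, and cached). The sketch below is an assumption about how that could be done, not something that exists in the repository; the function names and the sample numbers are invented for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def used_excluding_cache_kb(meminfo):
    """Memory in use by processes, ignoring buffers and page cache."""
    return (
        meminfo["MemTotal"]
        - meminfo["MemFree"]
        - meminfo.get("Buffers", 0)
        - meminfo.get("Cached", 0)
    )


# Made-up sample in the /proc/meminfo format, not taken from the server.
SAMPLE = """MemTotal:        2048000 kB
MemFree:         1228800 kB
Buffers:           27648 kB
Cached:           187392 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(used_excluding_cache_kb(info) // 1024, "MB used by processes")
```

On the server itself, the text would come from reading /proc/meminfo instead of the sample string, and the result could be logged or compared against a threshold.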

Checklist (for reviewing)

General

Is this PR explained thoroughly? All code changes must be accounted for in the PR description.

Is the PR labeled correctly? It should have the migration label if a new migration is added.

Is the risk level assessment sufficient? The risks section should contain all risks that might be introduced with the PR and which actions we need to take to mitigate these risks. Possible risks are database migrations, new libraries that need to be installed or changes to deployment scripts.

Functionality

Are all requirements met? Compare implemented functionality with the requirements specification.

Does the UI work as expected? There should be no JavaScript errors in the console; all resources should load. There should be no unexpected errors. Deliberately try to break the feature to find out if there are corner cases that are not handled.

Code

Do you fully understand the introduced changes to the code? If not, ask for clarification; it might uncover ways to solve a problem more elegantly and efficiently.

Does the PR introduce any inefficient database requests? Use the debug server to check for duplicate requests.

Are all necessary strings marked for translation? All strings that are exposed to users via the UI must be marked for translation.

Is the code documented sufficiently? Large and complex classes, functions or methods must be annotated with comments following our code-style guidelines.

Has the scalability of this change been evaluated?

Is there a maintenance plan in place?

Tests

Are there sufficient test cases? Ensure that all components are tested individually; models, forms, and serializers should be tested in isolation even if a test for a view covers these components.

If this is a bug fix, are tests for the issue in place? There must be a test case for the bug to ensure the issue won’t regress. Make sure that the tests break without the new code to fix the issue.

If this is a new feature or a significant change to an existing feature, has the manual testing spreadsheet been updated with instructions for manual testing?

Security

Confirm this PR doesn't commit any keys, passwords, tokens, usernames, or other secrets.

Are all UI and API inputs run through forms or serializers?

Are all external inputs validated and sanitized appropriately?

Does all branching logic have a default case?

Does this solution handle outliers and edge cases gracefully?

Are all external communications secured and restricted to SSL?

Documentation

Are changes to the UI documented in the platform docs? If this PR introduces new platform site functionality or changes existing functionality, the changes must be documented in the Cadasta Platform Documentation.

Are changes to the API documented in the API docs? If this PR introduces new API functionality or changes existing functionality, the changes must be documented in the API docs.

Are reusable components documented? If this PR introduces components that are relevant to other developers (for instance a mixin for a view or a generic form) they should be documented in the Wiki.