hackoregon / civic-devops

Master collection point for issues, procedures, and code to manage the HackOregon Civic platform
MIT License

Bump up memory for growing API containers #176

Closed MikeTheCanuck closed 6 years ago

MikeTheCanuck commented 6 years ago

Here's the current load on one of the EC2 instances, maybe a half-hour after a full refresh of the cluster:

CONTAINER ID        NAME                                                                                           CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
787ebac4478c        ecs-disaster-resilience-service-18-disaster-resilience-service-d29191f5fc94b0fb7700            0.07%               221.8MiB / 300MiB     73.94%              368kB / 6.56MB      378MB / 2.79MB      0
da278e9cba56        ecs-housing-service-16-housing-service-dcdfce9596a295f64d00                                    0.03%               59.2MiB / 100MiB      59.20%              224kB / 472kB       207MB / 0B          0
d22042aba46a        ecs-endpoint-service-17-endpoint-service-c6cbadb5a8b8d48d8701                                  0.00%               2.344MiB / 100MiB     2.34%               182kB / 325kB       29.2MB / 0B         0
48a0ea02acbb        ecs-neighborhood-development-service-7-neighborhood-development-service-86e6c2ac8998e0a34300   0.07%               231.6MiB / 300MiB     77.19%              833kB / 9.52MB      385MB / 2.87MB      0
08e4f76b02db        ecs-emergency-service-1-emergency-service-e6f5eea6fc838eee6400                                 0.03%               71.48MiB / 100MiB     71.48%              238kB / 2.77MB      427MB / 0B          0
c416bcbb3f48        ecs-budget-service-35-budget-service-eeceb8b9bcdbf4ed1400                                      0.03%               59.1MiB / 100MiB      59.10%              218kB / 1.13MB      173MB / 0B          0
eb1dcf8b24af        ecs-local-elections-service-13-local-elections-service-92b4afd18fb992830700                    0.07%               198.6MiB / 300MiB     66.19%              433kB / 12.2MB      225MB / 3.44MB      0
61791300f233        ecs-civic-lab-service-8-civic-lab-service-cee4e4ecb0deb0ddb901                                 0.00%               2.359MiB / 50MiB      4.72%               114kB / 168kB       29.2MB / 0B         0
4411d2d76cdc        ecs-civic-2018-service-8-civic-2018-service-fc86d3d88afaff83b601                               0.00%               22.83MiB / 100MiB     22.83%              220kB / 2.95MB      129MB / 0B          0
881a2cad2f88        ecs-housing-affordability-service-12-housing-affordability-service-cc9db296e6f5e5c47900        0.07%               336.1MiB / 400MiB     84.01%              24.3MB / 5.38MB     370MB / 1.13MB      0
12413f8619d8        ecs-civic-2017-service-9-civic-2017-service-fce0c9b8f7c8ed965c00                               0.00%               14.51MiB / 100MiB     14.51%              156kB / 269kB       115MB / 0B          0
6b20a30eb253        ecs-transportation-systems-service-4-transportation-systems-service-82d5a0d9a1da8b9d1e00       0.07%               225.5MiB / 500MiB     45.11%              565kB / 7.16MB      383MB / 2.79MB      0
8162ed1bf21f        ecs-homeless-service-15-homeless-service-d8ad97bfb8f983841800                                  0.03%               59.12MiB / 128MiB     46.19%              209kB / 1.8MB       170MB / 0B          0
21f9b110de2c        ecs-agent                                                                                      0.46%               14.02MiB / 7.792GiB   0.18%               0B / 0B             106MB / 147MB       0

The three services whose memory I've been watching - and incrementally bumping up - along the way are:

I'm going to add 100 MB to each of those, and keep monitoring for growth.

It's not that these Django apps necessarily need the extra memory. We've observed that the same application activity can live within more constrained memory limits (as if there's a dynamic allocator at work; I don't know, I'm just speculating). But as a risk-mitigation strategy to make things as smooth as possible through Demo Day, this is a safe change: the EC2 instance is currently running with > 5 GB free (since I've temporarily disabled last year's transportService), so an added 300 MB will still leave enough room for a full refresh (as we observed in #175).
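
For the "keep monitoring for growth" part, here is a rough sketch of how pasted `docker stats` output like the above could be scanned for containers that are getting close to their memory limit. It is an illustration, not the tooling used on this cluster: the function names `parse_rows` and `near_limit` and the 70% threshold are assumptions made for the example, and the sample rows are copied from the output above with whitespace collapsed.

```python
import re

# A few rows copied from the `docker stats` output above (whitespace collapsed).
SAMPLE = """\
787ebac4478c ecs-disaster-resilience-service-18-disaster-resilience-service-d29191f5fc94b0fb7700 0.07% 221.8MiB / 300MiB 73.94% 368kB / 6.56MB 378MB / 2.79MB 0
da278e9cba56 ecs-housing-service-16-housing-service-dcdfce9596a295f64d00 0.03% 59.2MiB / 100MiB 59.20% 224kB / 472kB 207MB / 0B 0
881a2cad2f88 ecs-housing-affordability-service-12-housing-affordability-service-cc9db296e6f5e5c47900 0.07% 336.1MiB / 400MiB 84.01% 24.3MB / 5.38MB 370MB / 1.13MB 0
21f9b110de2c ecs-agent 0.46% 14.02MiB / 7.792GiB 0.18% 0B / 0B 106MB / 147MB 0
"""

# Conversion factors to MiB for the units seen in the MEM USAGE / LIMIT column.
UNITS_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# Container ID, name, CPU %, then "<used><unit> / <limit><unit>".
ROW = re.compile(
    r"^(?P<id>[0-9a-f]{12})\s+(?P<name>\S+)\s+[\d.]+%\s+"
    r"(?P<used>[\d.]+)(?P<used_unit>[KMG]iB)\s+/\s+"
    r"(?P<limit>[\d.]+)(?P<limit_unit>[KMG]iB)"
)


def parse_rows(text):
    """Return (name, used_mib, limit_mib) for each recognizable stats row."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header line, or a row in a format this sketch doesn't handle
        used = float(match["used"]) * UNITS_TO_MIB[match["used_unit"]]
        limit = float(match["limit"]) * UNITS_TO_MIB[match["limit_unit"]]
        rows.append((match["name"], used, limit))
    return rows


def near_limit(rows, threshold_pct=70.0):
    """Names of containers whose usage is at or above threshold_pct of their limit.

    The 70% default is an arbitrary value chosen for this example.
    """
    return [name for name, used, limit in rows if used / limit * 100 >= threshold_pct]


if __name__ == "__main__":
    for name in near_limit(parse_rows(SAMPLE)):
        print(name)
```

Recomputing the percentage from usage and limit, rather than trusting the MEM % column, keeps the check working if the pasted output gets truncated to the right of the limit.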

MikeTheCanuck commented 6 years ago

And PR 42 has successfully been executed.