Closed by dacook 1 year ago
Before running reports:
ofn-admin@production18:~$ ps -eo size,pid,user,command | egrep 'puma|sidekiq'
76700 11211 openfoo+ puma 6.2.2 (unix:///home/openfoodnetwork/apps/openfoodnetwork/shared/sock/puma.openfoodnetwork.sock) [2023-04-25-131137]
1914572 22139 openfoo+ sidekiq 7.0.9 2023-04-25-131137 [0 of 5 busy]
1498212 25734 openfoo+ puma: cluster worker 0: 11211 [2023-04-25-131137]
1703028 25793 openfoo+ puma: cluster worker 1: 11211 [2023-04-25-131137]
https://app.datadoghq.com/dashboard/bdw-2na-83i/openfoodnetworkorguk-cloned
After report terminated:
After running a report covering 1 year: it took extra memory but didn't reach the system limit. We can see that Sidekiq now has the largest allocation (4164016 in the `size` column, up from 1914572 before), because it is now the process running the report.
ofn-admin@production18:~$ ps -eo size,pid,user,command | egrep 'puma|sidekiq'
76700 11211 openfoo+ puma 6.2.2 (unix:///home/openfoodnetwork/apps/openfoodnetwork/shared/sock/puma.openfoodnetwork.sock) [2023-04-25-131137]
1635444 13831 openfoo+ puma: cluster worker 0: 11211 [2023-04-25-131137]
1913708 13881 openfoo+ puma: cluster worker 1: 11211 [2023-04-25-131137]
4164016 16145 openfoo+ sidekiq 7.0.9 2023-04-25-131137 [0 of 5 busy]
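To make the before/after listings easier to compare, the same `ps` output could be sorted by size and converted to MiB. This is only a sketch: `mem_summary` is a hypothetical helper name, and it assumes the `size` column is reported in KiB, which is worth checking against the local `ps` man page.

```shell
# Hypothetical helper: reads `ps -eo size,pid,user,command` output on stdin
# and prints the puma/sidekiq processes, largest first, with size in MiB.
# Assumption: the `size` column is in KiB.
mem_summary() {
  sort -rn | awk '
    # The [p]/[s] brackets keep this awk process itself from matching.
    /[p]uma|[s]idekiq/ {
      printf "%d MiB\tpid %s\t", $1 / 1024, $2
      # Print the command (field 4 onwards) unchanged.
      for (i = 4; i <= NF; i++) printf "%s%s", $i, (i < NF ? " " : "\n")
    }'
}

# Usage on the server:
#   ps -eo size,pid,user,command | mem_summary
```

On the listing above, this would put the Sidekiq line first at roughly 4066 MiB.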
Not good. There's an error in displaying all onscreen results, so we need to abort.
Also:
Rollback: I've disabled the feature toggle and restarted both sidekiq and puma. Memory usage is back to a normal amount. Tested to confirm:
I can see that the last two nightly Puma restarts successfully reset the memory, so I think no further action is required in the short term.
The UK server has had downtime, which seems to be the result of memory being fully allocated. We were working around this by restarting Puma daily, although for the last week this hasn't seemed to help.
So let's hurry up and make use of the new background reports feature. It's still in progress, but will be an improvement. https://openfoodnetwork.slack.com/archives/C01T75H6G0Z/p1681840327630489
Prep
Deployment plan
Rollback plan