Closed franck-boullier closed 5 years ago
The main change I made was to optimise the DB connection. So now there is no explicit close and open (~2s) between function invocations.
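For anyone following along, the idea of keeping the connection between invocations can be sketched roughly as below. This is a minimal illustration in Python, not the actual lambda2sns code; `ConnectionCache` and the `connect` factory are hypothetical names, and a real driver's connect call would take their place.

```python
class ConnectionCache:
    """Keeps one connection alive across invocations (hypothetical helper)."""

    def __init__(self, connect):
        self._connect = connect  # assumed: a zero-argument factory returning a connection
        self._conn = None
        self.opens = 0           # how many times a connection was actually opened

    def get(self):
        # Open lazily on first use, then hand back the same object every time.
        if self._conn is None:
            self._conn = self._connect()
            self.opens += 1
        return self._conn

    def invalidate(self):
        # Drop the cached connection, e.g. after a connection error,
        # so the next get() opens a fresh one.
        self._conn = None
```

The trade-off is that a cached connection can go stale if the database restarts or drops it, so the caller needs some way to invalidate and reconnect.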
Currently I'm struggling with https://enterprise.dev.unee-t.com/enterprise/menu.php
https://enterprise.dev.unee-t.com/enterprise/menu.php is accessible for me. What seems to be the issue?
[2019-05-20T11:47:58+08:00] (ecs/bugzilla/bf7758eb-b991-4b29-9463-a7ee1ce515d8) [Mon May 20 03:47:58.193438 2019] [:error] [pid 32] \nCan't connect to the database.\nError: Can't connect to MySQL server on 'auroradb.dev.unee-t.com' (111)\n Is your database installed and up and running?\n Do you have the correct username and password selected in localconfig?\n\n
Not sure it makes sense to roll back this change, which is admittedly faster than it was before, though functionally the same. It has just exposed other issues in the system.
https://media.dev.unee-t.com/2019-05-20/processlist.txt
$ grep sort processlist.txt | wc -l
78
I am worried about what Bugzilla is doing on the database, as well as invites failing with: Error 1213: Deadlock found when trying to get lock; try restarting transaction [Invite API Lambda error]
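Since the 1213 message itself says "try restarting transaction", one option is to retry the transaction a few times with a short backoff. A rough sketch in Python, under assumptions: `DBError` is a hypothetical stand-in for whatever the driver raises, `run_with_retry` is an illustrative name, and this is not how the Invite API Lambda currently behaves.

```python
import random
import time

DEADLOCK_ERRNO = 1213  # MySQL error code for "Deadlock found when trying to get lock"


class DBError(Exception):
    """Hypothetical stand-in for a driver error that carries a MySQL errno."""

    def __init__(self, errno, msg=""):
        super().__init__(msg)
        self.errno = errno


def run_with_retry(txn, attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run txn(); on a deadlock error, back off and try again."""
    for attempt in range(attempts):
        try:
            return txn()
        except DBError as err:
            is_last = attempt == attempts - 1
            # Only deadlocks are retried; anything else is re-raised at once.
            if err.errno != DEADLOCK_ERRNO or is_last:
                raise
            # Exponential backoff with jitter so retries don't collide again.
            sleep(base_delay * (2 ** attempt) * (1 + random.random()))
```

The callable passed as `txn` should contain the whole transaction, because a deadlock rolls the entire transaction back.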
To make the RDS more available whilst expensive queries and sort indexes are being generated, I suggest we can:
Prod has been rolled back to version=41 which maps to https://github.com/unee-t/lambda2sns/commit/49a8c341fad220b213e6c495839c40086fb3bb36
It's not as fast, due to the setup/teardown of the SQL connection on each invocation, and in some cases it doesn't return the error properly so that the call can be retried.
The lambda2sns refactor in dev has shown, at least to me, that the changes are fine.
Re pressure on the database: we need to tweak concurrency so that we don't overload it.
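To illustrate what capping concurrency means here: at most N callers talk to the database at once and the rest wait. For a Lambda this would normally be set as a concurrency limit on the function itself rather than in code, so the Python below only shows the idea in-process. `LimitedDB` is a hypothetical name, not something in lambda2sns.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LimitedDB:
    """Lets at most max_concurrent callers run database work at once."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.in_flight = 0  # callers currently inside run()
        self.peak = 0       # highest in_flight value observed

    def run(self, work):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return work()
            finally:
                with self._lock:
                    self.in_flight -= 1


if __name__ == "__main__":
    db = LimitedDB(max_concurrent=2)
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(lambda i: db.run(lambda: time.sleep(0.01)), range(20)))
    print("peak concurrent DB calls:", db.peak)
```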
The real reason why the RDS "keeps disappearing" is probably the queries stuck in the "Creating sort index" state, as seen in https://media.dev.unee-t.com/2019-05-20/processlist.txt
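To get a quick breakdown of a processlist dump by state, something like the following Python sketch could be used. It assumes the dump is plain text with one thread per line and the state string appearing somewhere on that line. The `STATES` tuple and `tally_states` are illustrative choices, not derived from the actual file.

```python
from collections import Counter

# States to look for; adjust to whatever shows up in the dump (assumed list).
STATES = ("Creating sort index", "Sending data", "Sleep")


def tally_states(text, states=STATES):
    """Count lines per state; each line counts once, for the first state it matches."""
    counts = Counter()
    for line in text.splitlines():
        for state in states:
            if state in line:
                counts[state] += 1
                break
    return counts
```

Usage would be along the lines of `tally_states(open("processlist.txt").read())`, which gives a per-state count instead of a single grep total.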
The problem:
A few days ago, when I was doing a mass assignment of a user to several units via the Unee-T Enterprise interface, things were working as intended (in the DEV/Staging environment):
Now, when I'm trying to do a mass assignment of a user to several units via the Unee-T Enterprise interface, this is NOT working as it should:
More information:
This problem started to appear after this commit was rolled out.
See slack conversation.