Open nicholas-kebbas opened 5 years ago
@nicholas-kebbas can you give me some cas users to test on login?
@labboy0276 Sure:
Usernames: d3roles, d2roles, dfaculty, dstudent, demployee
All passwords are StashBoard1
d3roles will give you the worst-case load time
OK @nicholas-kebbas I redid a lot of how the code works for flagging users on login + assigning weights. There is no need for the batch anymore either.
I tested this on a PR multidev and my logins took around 3-4 seconds at most, even with d3roles. I imagine on live it will be much faster.
QA: http://pr-398-dgreat.pantheonsite.io/ PR: https://github.com/rjbain/dgreat/pull/398
Hey @labboy0276 The functionality still seems to be working, but I'm getting 10+ second logins with d3roles on http://pr-398-dgreat.pantheonsite.io/. Odd that you're getting such fast logins.
I took a quick look at New Relic and there are some spikes caused by the dgreat_group module whenever I log in. The chart does look a little different than it did previously:
The dgreat_views spike only seems to happen when I access the /favorite-links page to reorder.
I will double check @nicholas-kebbas but I was able to log in in around 3-4 seconds each time.
I know why: the tables for flagging and user weights are huge, and I was running against truncated tables @nicholas-kebbas
I need to look at this more tomorrow to see what else I can improve.
@labboy0276 Ah, that makes sense. I'll try to think of some ways as well.
@nicholas-kebbas OK, I noticed the custom user weights table had no indexes on it. I went ahead and created 2 indexes, which usually helps with performance. I tested it with d3roles and logged in quickly.
Can you double check?
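For anyone following along, here is a rough sketch of why adding an index tends to help a lookup like this. It uses an in-memory SQLite database as a stand-in for the site's actual database, and the table name, column names, and index name (`user_link_weights`, `uid`, `flagging_id`, `weight`, `idx_weights_uid`) are all hypothetical, since the real schema is not shown in this thread. It only compares the query plan before and after the index is created.

```python
import sqlite3

# Hypothetical schema: the real custom user weights table is not shown in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_link_weights (uid INTEGER, flagging_id INTEGER, weight INTEGER)"
)
conn.executemany(
    "INSERT INTO user_link_weights VALUES (?, ?, ?)",
    [(uid, fid, fid % 10) for uid in range(200) for fid in range(50)],
)

# Assumed shape of the per-login lookup: fetch one user's weights.
QUERY = "SELECT flagging_id, weight FROM user_link_weights WHERE uid = ?"


def query_plan(connection, sql, params):
    """Return the detail strings of SQLite's EXPLAIN QUERY PLAN output."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Without an index, SQLite has to scan every row in the table.
plan_before = query_plan(conn, QUERY, (42,))

# With an index on the filtered column, it can search the index instead.
conn.execute("CREATE INDEX idx_weights_uid ON user_link_weights (uid)")
plan_after = query_plan(conn, QUERY, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

On a table with a handful of rows the difference is invisible, which would fit with the truncated tables hiding the problem earlier; the full scan only hurts once the table is large.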
@labboy0276 That seems to have helped. I logged in in about 5 seconds with d3roles. The spikes in New Relic are also better:
@labboy0276 Did you end up running any of the Blazemeter tests and notice any improvements there?
@nicholas-kebbas I was going to try and see if I could get the JMeter or Blazemeter stuff to work today
OK
So I added another index and did some testing based off of that. It is slightly more performant based on my tests as you can see here: https://docs.google.com/spreadsheets/d/1ALBBw4kCsLt6q6_UFibPJEXLAQL7MrPkMk-NeiXHZfo/edit#gid=0
This is testing the flagging of the default links per login. That is where the huge hangups were. Each cycle is a different group adding its flags. This was all done with the d3roles login.
Also, you can see from the New Relic graph that the whole flow with 3 indexes seems more stable (they are the first 3 peaks):
So I then went ahead and set up a Blazemeter test with 50 users logging in concurrently via POST requests. I am not 100% sure it was hitting all the same functionality, but I also logged in dozens of times myself while the test was running. Every time I was in within a couple of seconds.
You can see on this graph from 6:15 on what was going on. The peaks are where I was also logging in at the same time.
This all seems rock solid from my end @nicholas-kebbas. Going forward, the initial login per user after the script changes will be a little slow (around 4 seconds); after that, login time should be cut almost in half. I am sure on live this will be faster as well.
Also, when pushing this to live make sure you update the DB as well.
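As a rough illustration of the concurrent login test described above, here is a minimal timing harness. It is only a sketch, not the Blazemeter configuration that was actually run: `fake_login` is a hypothetical placeholder that sleeps instead of sending a POST request, and `run_concurrent_logins` is a made-up helper name. To try it against a real environment you would swap in a function that performs the actual login request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_login(username):
    """Hypothetical placeholder for a real login request; sleeps to simulate server work."""
    time.sleep(0.01)
    return True


def timed_call(login, username):
    """Run one login attempt and return (succeeded, elapsed_seconds)."""
    start = time.perf_counter()
    ok = login(username)
    return ok, time.perf_counter() - start


def run_concurrent_logins(login, usernames, concurrency):
    """Fire the logins through a thread pool and summarize the timings."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda name: timed_call(login, name), usernames))
    durations = sorted(elapsed for _, elapsed in results)
    return {
        "requests": len(results),
        "failures": sum(1 for ok, _ in results if not ok),
        "median_s": durations[len(durations) // 2],
        "max_s": durations[-1],
    }


# 50 simultaneous logins as the worst-case test user mentioned in this thread.
summary = run_concurrent_logins(fake_login, ["d3roles"] * 50, concurrency=50)
print(summary)
```

Reporting the median and the maximum separately matters here, since the slow logins were the multi-role users rather than the average case.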
@labboy0276 Thank you for the information. We are reviewing the data and will get back to you with any questions. @nicholas-kebbas @reynoldsalec
@labboy0276 Thanks John. Could you try running the Blazemeter test with 1000 concurrent users? We want an accurate picture of how the site might respond to a high volume of simultaneous users. If that checks out, we'll work on putting this into production this afternoon to see how it performs.
We encountered performance issues on our live site when we pushed the most recent Top Apps changes (https://github.com/rjbain/dgreat/commit/6b88e2f4541f2406681ce82c9ece02f8fed8e074 and https://github.com/rjbain/dgreat/commit/aaa30ccb44a382692f812f22a6f269c77f2e9065)
We pushed out the changes on the morning of Oct. 30 and disabled them on the evening of Oct. 31.
Additional Info:
We saw login time go from ~4 seconds before the changes were pushed out to 10+ seconds afterwards. Once we reverted the changes, login time went back to ~4 seconds.
We saw that login took longer (15-20+ seconds) for users with multiple roles/groups (for example, someone with the student, faculty, and employee roles/groups).
It should be possible to replicate this by logging in through CAS as a user with 3 roles/groups. There should be a noticeable delay. This delay seems to get longer the more users are logging in at once.