EventideSystems / tool_for_systemic_change

GNU Affero General Public License v3.0
3 stars 0 forks source link

Performance Improvements #450

Closed ferrisoxide closed 3 years ago

ferrisoxide commented 4 years ago

Problem

Wicked is currently running on a single app server. Long running process (e.g. running reports, or importing data) have the potential to block operations, causing the application to "hang". It is relatively easy to block the application - two users running reports at the same time could cause the application to become unresponsive for all other users.

Solution

Adding a worker server is a fairly conventional way to scale an application. Essentially, long running jobs are placed on a queue and executed in sequence. The main web server continues operating, taking requests and performing normal operations, while the worker server performs the long running jobs. As each job finishes the worker server communicates back to the web server, which in turn communicates the results of the job to the end user.

Under this model any number of users could run reports - or other long running jobs - without directly impacting any other user. The worker server could have hundreds of jobs sitting in its queue and would eventually process all of them, given enough time. The only perceived impact for a user interacting with the system while a number of jobs are being executed is that their own queued jobs would take a bit longer.

Costs

Adding a worker server adds $20+GST to monthly operating costs.

Converting existing process to jobs that can be consumed by the worker process is a separate activity, with separate costs. A suggested schedule of work is as follows:

NB there is no requirement to do any of these in any particular order - or at all for that matter. This proposal acts as a placeholder for work that may be done to manage the risk around potential performance issues.

In any case, it is recommended that any of this work occur after upgrading to Rails 6 as this version provides improved mechanisms for reporting job execution back to the user interface (via ActionCable).

emily-humphreys commented 4 years ago

Yep lets do it

On Wed, Jun 17, 2020 at 7:31 PM Tom Tuddenham notifications@github.com wrote:

Problem

Wicked is currently running on a single app server. Long running process ( e.g. running reports, or importing data) have the potential to block operations, causing the application to "hang". It is relatively easy to block the application - two users running reports at the same time could cause the application to become unresponsive for all other users. Solution

Adding a worker server is a fairly conventional way to scale an application. Essentially, long running jobs are placed on a queue and executed in sequence. The main web server continues operating, taking requests and performing normal operations, while the worker server performs the long running jobs. As each job finishes the worker server communicates back to the web server, which in turn communicates the results of the job to the end user.

Under this model any number of users could run reports - or other long running jobs - without directly impacting any other user. The worker server could have hundreds of jobs sitting in its queue and would eventually process all of them, given enough time. The only perceived impact for a user interacting with the system while a number of jobs are being executed is that their own queued jobs would take a bit longer. Costs

Adding a worker server adds $20+GST to monthly operating costs.

Converting existing process to jobs that can be consumed by the worker process is a separate activity, with separate costs. A suggested schedule of work is as follows:

  • Outgoing emails (e.g. sign up confirmation, password changes, invitations) $100+GST
  • Generate reports (per Reports page) $500+GST
  • Import Comments, Organisations $400+GST

NB there is no requirement to do any of these in any particular order - or at all for that matter. This proposal acts as a placeholder for work that may be done to manage the risk around potential performance issues.

In any case, it is recommended that any of this work occur after upgrading to Rails 6 as this version provides improved mechanisms for reporting job execution back to the user interface (via ActionCable).

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/ferrisoxide/wicked_software/issues/450, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFVZOSR4OIPAQLJ66ZSMKLRXCIAZANCNFSM4OAOAPEQ .

ferrisoxide commented 3 years ago

Per #555, we've actually been able to move to a "serverless" model using AWS Lambda. This is a more sensible approach than running up additional servers.

Will some more work here to finish off the work started in #555, removing unused code and reduce the overall load on the existing server infrastructure.

ferrisoxide commented 3 years ago

@emily-humphreys A new build is up on staging. I've renamed this ticket to "Performance Improvements" with the intent of squeezing out the last performance improvements we can get out of the system without having to resort to spinning up extra services. One day, maybe, we will need a background service, but its better to strengthen the underlying app before throwing extra services at it. #555 also gives us some breathing space.

Have a look at staging, give it a run. There's probably more I can do to speed things up, but I'm using what's currently on staging as a baseline and I'll be tracking any changes against this ticket now (#450). I'll put a couple more hours into this and then we'll call the pre-paid work done.

Looking forwards to catching up on Thursday. There's probably plenty we'll have to talk about re future directions.

emily-humphreys commented 3 years ago

I'm not sure how I QA this - I just loaded a bunch of Transition Cards and Ecosystem Maps on staging and they look like they load quickly. The only change I could see is that before you when you clicked on a Transition Card it would go there immediately, but it would take time to render. Now it has a spinning wheel while it processes and then takes you to the fully rendered TC.

Is this the change I should see?

Also for some reason, this Transition Card has multiple WA School Canteen Association Initiatives (but it is only listed once as an initiative in the Initiative Tabs). It looks unrelated to this but thought I'd point it out

https://wickedlab-staging.herokuapp.com/transition_cards/4

On Tue, Mar 30, 2021 at 8:43 PM Tom Tuddenham @.***> wrote:

@emily-humphreys https://github.com/emily-humphreys A new build is up on staging. I've renamed this ticket to "Performance Improvements" with the intent of squeezing out the last performance improvements we can get out of the system without having to resort to spinning up extra services. One day, maybe, we will need a background service, but its better to strengthen the underlying app before throwing extra services at it. #555 https://github.com/ferrisoxide/wicked_software/issues/555 also gives us some breathing space.

Have a look at staging, give it a run. There's probably more I can do to speed things up, but I'm using what's currently on staging as a baseline and I'll be tracking any changes against this ticket now (#450 https://github.com/ferrisoxide/wicked_software/issues/450). I'll put a couple more hours into this and then we'll call the pre-paid work done.

Looking forwards to catching up on Thursday. There's probably plenty we'll have to talk about re future directions.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ferrisoxide/wicked_software/issues/450#issuecomment-810098420, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFVZOSBWLWCZG3GYHUDQWTTGGP5VANCNFSM4OAOAPEQ .

ferrisoxide commented 3 years ago

@emily-humphreys Ah, sorry - should have left some notes. It's basically a "smoke test" - i.e. does it appear to function as before. There shouldn't be any significant changes, other than it should be a bit faster in places.

I'll look into the WA School Canteen Association Initiatives. It probably is unrelated, but worth checking.

ferrisoxide commented 3 years ago

@emily-humphreys

Ah... I see what is going on. There are actually 2 initiatives with the same name, but they just get repeated.

image

The code that builds up the transition card wasn't expecting initiatives with the same name. I've just added a validation to prevent two initiatives within the same transition card from having the same name (initiatives in different transition cards can share the same name). That should prevent the same issue occurring again - or at least prevent users from inadvertently entering the same initiative two.

I'll have to come back to this and figure out why the initiatives we're being repeated. In the meantime I've just changed the second initiative here to 'WA School Canteens Association 2'.

emily-humphreys commented 3 years ago

Thank s for checking it out - but the multiples were in the TC (there were about 12).

On Wed, 31 Mar 2021 at 6:33 pm, Tom Tuddenham @.***> wrote:

@emily-humphreys https://github.com/emily-humphreys

Ah... I see what is going on. There are actually 2 initiatives with the same name, but they just get repeated.

[image: image] https://user-images.githubusercontent.com/6169/113105730-75b50e80-9249-11eb-8f10-7795b16ace08.png

The code that builds up the transition card wasn't expecting initiatives with the sane name. I've just added a validation to prevent two initiatives within the same transition card from having the same name (initiatives in different transition cards can share the same name). That should prevent the same issue occurring again - or at least prevent users from inadvertently entering the same initiative two.

I'll have to come back to this and figure out why the initiatives we're being repeated. In the meantime I've just changed the second initiative here to 'WA School Canteens Association 2'.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ferrisoxide/wicked_software/issues/450#issuecomment-810865017, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFVZOWCCJSYESXWBRJNKMTTGLJO5ANCNFSM4OAOAPEQ .

ferrisoxide commented 3 years ago

The multiples were displayed, but it was for the same two records over and over. They all had the id of 149 or 150 (the ids of those two in the screenshot before). I could see it when I hovered over the links - each of the dozen or so initiatives pointed to one of those two.

Fundamentally, I need to fix the TC code so it's not sensitive to repeated names, but preventing repeats will also help. I've a note to look into the why the TC summary is generated.

emily-humphreys commented 3 years ago

Great - Thanks TT!

On Wed, Mar 31, 2021 at 8:36 PM Tom Tuddenham @.***> wrote:

The multiples were displayed, but it was for the same two records over and over. They all had the id of 149 or 150 (the ids of those two in the screenshot before). I could see it when I hovered over the links - each of the dozen or so initiatives pointed to one of those two.

Fundamentally, I need to fix the TC code so it's not sensitive to repeated names, but preventing repeats will also help. I've a note to look into the why the TC summary is generated.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ferrisoxide/wicked_software/issues/450#issuecomment-810945238, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFVZOURYCJ4CG5XTOF6OLDTGLX2HANCNFSM4OAOAPEQ .

ferrisoxide commented 3 years ago

Closing this. It's become a big of a grab-bag of stuff.