GSA / notifications-api

The API powering Notify.gov

Identify any possible performance improvements to be had with the time-based processing #910

Open ccostino opened 5 months ago

ccostino commented 5 months ago

Larger time-based reports currently time out: the download request fails before the report has finished generating. To help alleviate this, we'd like to see whether any performance improvements can be made.

This work is a short-term fix while we continue discussing and planning a long-term fix and improvement, but any performance gains in the current processing will help that future effort as well.

Implementation Sketch and Acceptance Criteria

Security Considerations

terrazoon commented 4 months ago

I did some profiling locally after doing 50 one-off messages:

The downloads (when the job is not in the cache) take anywhere from 250 to 1,400 milliseconds and average out at about 400 milliseconds, which is 90% of the time the report needs to generate. So there is no optimization available in the code itself. I think the options might be:

terrazoon commented 1 month ago

The issue at this point has nothing to do with inefficiencies in the code, but rather inefficiencies in the design.

We are now downloading -- via a task -- every thirty minutes to try to "top up" our in-memory cache. But the in-memory cache is wiped out on restarts, and in addition, each worker needs its own cache.
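To make the design problem concrete, here is a minimal sketch of a per-process cache with a periodic top-up. It is not the actual notifications-api code: `JobCache`, `top_up`, and the `download` callable are hypothetical names, and the TTL is an assumption added for illustration.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple


class JobCache:
    """Hypothetical per-process, in-memory cache of job file contents."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # job_id -> (time stored, contents); lives only in this process.
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, job_id: str) -> Optional[str]:
        entry = self._entries.get(job_id)
        if entry is None:
            return None
        stored_at, contents = entry
        if self._clock() - stored_at > self._ttl:
            # Entry is too old; drop it so the next top-up re-downloads it.
            del self._entries[job_id]
            return None
        return contents

    def set(self, job_id: str, contents: str) -> None:
        self._entries[job_id] = (self._clock(), contents)


def top_up(cache: JobCache, job_ids: Iterable[str],
           download: Callable[[str], str]) -> int:
    """Download every job missing from this cache; return how many were fetched."""
    fetched = 0
    for job_id in job_ids:
        if cache.get(job_id) is None:
            cache.set(job_id, download(job_id))
            fetched += 1
    return fetched
```

In this sketch, two workers hold two separate `JobCache` objects, so a top-up in one does nothing for the other, and a restart amounts to constructing a fresh, empty `JobCache`. Both cases fall back to the slow download path.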

This would be fixable by moving the jobs data to Redis, but that is not a desirable solution, so we are left with possibly adjusting the report run time (?), maybe to every 15 minutes? But that comes with another set of tradeoffs.
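One of those tradeoffs can be estimated with rough arithmetic. The sketch below assumes that "report run time" means the interval of the top-up task, that every worker runs its own top-up, and that every job is downloaded on every run (a worst case with no cache hits). `daily_top_up_cost` and its parameters are hypothetical; the 0.4-second default is the average download time measured earlier in this thread.

```python
from typing import Tuple


def daily_top_up_cost(interval_minutes: int, jobs_per_run: int, workers: int,
                      avg_download_seconds: float = 0.4) -> Tuple[int, int, float]:
    """Worst-case daily cost of topping up every worker's cache.

    Returns (runs per day, total downloads, total download seconds).
    """
    runs_per_day = (24 * 60) // interval_minutes
    downloads = runs_per_day * jobs_per_run * workers
    return runs_per_day, downloads, downloads * avg_download_seconds


# Halving the interval from 30 to 15 minutes doubles the runs per day (48 to 96),
# and with them the worst-case number of downloads.
every_30 = daily_top_up_cost(30, jobs_per_run=50, workers=4)
every_15 = daily_top_up_cost(15, jobs_per_run=50, workers=4)
```

The job and worker counts here are placeholders, not measurements from production.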