reanahub / reana-db

REANA database utilities
http://reana-db.readthedocs.org
MIT License

quotas: improve performance of quota updater #193

Open mdonadoni opened 1 year ago

mdonadoni commented 1 year ago

The periodic quota updater takes a long time to complete when dealing with many workflows and/or with big workspaces.

These are some possible improvements:

  1. Currently, Session.commit() is called once for every workflow in store_workflow_disk_quota (here and here). Instead, it should be called only once, after all the quota updates have been calculated. The same improvement can also be applied to the other utility functions that update the disk/CPU quotas of workflows and users. In particular, I have noticed that Session.commit() becomes slower the more workflows are loaded from the database.
  2. The disk quota usage of all workflows is recalculated, even though many workflows have not changed since the last execution of the quota updater. One possible solution would be to consider only workflows that are "dirty", that is, workflows whose workspace might have changed since the last quota update (e.g. a session has been opened, or a file has been deleted or uploaded).
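
To illustrate both points, here is a minimal sketch, not the actual reana-db code. It uses the standard-library sqlite3 module as a stand-in for the SQLAlchemy session, and the workflow table, its dirty and disk_usage columns, and the get_disk_usage helper are all hypothetical names chosen for the example. It recalculates usage only for dirty workflows and commits once for the whole batch.

```python
import sqlite3


def get_disk_usage(workspace_path, sizes):
    # Hypothetical stand-in for scanning the workspace on disk.
    return sizes[workspace_path]


def update_disk_quotas(conn, sizes):
    """Recompute disk usage for dirty workflows only, committing once."""
    # Point 2: skip workflows whose workspace has not changed.
    rows = conn.execute(
        "SELECT id, workspace_path FROM workflow WHERE dirty = 1"
    ).fetchall()

    # Calculate every update first, without touching the transaction.
    updates = [(get_disk_usage(path, sizes), wf_id) for wf_id, path in rows]

    conn.executemany(
        "UPDATE workflow SET disk_usage = ?, dirty = 0 WHERE id = ?",
        updates,
    )
    # Point 1: a single commit for the whole batch, not one per workflow.
    conn.commit()
    return len(updates)


# Toy data: workflow 1 is clean, workflows 2 and 3 are dirty.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workflow ("
    "id INTEGER PRIMARY KEY, workspace_path TEXT, "
    "disk_usage INTEGER, dirty INTEGER)"
)
conn.executemany(
    "INSERT INTO workflow VALUES (?, ?, ?, ?)",
    [(1, "/ws/a", 10, 0), (2, "/ws/b", 20, 1), (3, "/ws/c", 30, 1)],
)
conn.commit()

sizes = {"/ws/a": 11, "/ws/b": 25, "/ws/c": 35}
updated = update_disk_quotas(conn, sizes)
result = conn.execute(
    "SELECT id, disk_usage, dirty FROM workflow ORDER BY id"
).fetchall()
```

In this toy run only the two dirty workflows are rescanned, and the clean one keeps its stored usage. A real implementation would also have to decide where the dirty flag gets set, and how to handle a workspace that changes between the scan and the commit; the sketch does not cover either.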

mdonadoni commented 1 year ago

See also https://github.com/reanahub/reana-db/pull/200#issuecomment-1693415203