bcgov / NRPTI

Natural Resources Public Transparency Initiative
Apache License 2.0

Reduce Audit Collection size #1257

Open sggerard opened 2 months ago

sggerard commented 2 months ago

Describe the Bug: The Audit collection in the NRPTI MongoDB prod database is huge (19 million records), which has been making the backups slow and has even caused them to crash.

The Audit collection stores events such as record creation and publishing, but it currently also stores every search query ever made, and the front end issues multiple search queries on every page load.

Expected Behaviour: The Audit collection should be a reasonable size, not several million records.

Actual Behaviour: The Audit collection holds about 19 million records and is causing issues with the backups.

Implications: Backups are critical for production data such as NRPTI records. Backing up and restoring such a large collection consumes additional resources, which leads to hitting resource limits and causes crashes.

I would suggest not storing search queries in the Audit collection, or clearing out records older than, say, a year. A migration will likely be necessary.
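As a rough illustration of what such a migration could look like, here is a minimal sketch. It assumes Audit documents carry an `action` field (set to 'Search' for search queries) and a `timestamp` date field. Both field names are assumptions and should be checked against the actual Audit model. The helper names `buildAuditCleanupFilter` and `cleanUpAudit` are hypothetical. This version combines both suggestions by removing every search entry as well as anything older than the retention window.

```javascript
// Hypothetical sketch: the field names `action` and `timestamp` are assumptions
// about the Audit schema and must be verified before running anything like this.

// Build a filter matching audit documents that are search entries
// or are older than the retention window.
function buildAuditCleanupFilter(now, retentionDays) {
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - retentionDays * msPerDay);
  return {
    $or: [
      { action: 'Search' },
      { timestamp: { $lt: cutoff } }
    ]
  };
}

// `collection` is expected to behave like a MongoDB Node.js driver Collection,
// whose deleteMany(filter) resolves to an object with a `deletedCount` field.
async function cleanUpAudit(collection, now = new Date(), retentionDays = 365) {
  const filter = buildAuditCleanupFilter(now, retentionDays);
  const result = await collection.deleteMany(filter);
  return result.deletedCount;
}
```

With roughly 19 million documents, a single delete may run for a long time, so a real migration would probably need to delete in batches and be tried against a copy of the data first.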

Affected code: search.js line 787

// Every search request is written to the Audit collection, including anonymous public ones.
QueryUtils.audit(
  args,
  'Search',
  keywords,
  args.swagger.params.auth_payload
    ? args.swagger.params.auth_payload
    : { idir_userid: null, displayName: 'public', preferred_username: 'public' }
);
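One possible way to stop recording search queries without touching other audit events is to guard the call with a small allow/deny check. The sketch below is only an illustration: `UNAUDITED_ACTIONS`, `shouldAudit`, and `auditIfNeeded` are hypothetical names, and the audit function is passed in as a parameter so that the example does not assume anything about the real `QueryUtils.audit` beyond the four arguments shown above.

```javascript
// Hypothetical sketch: actions listed here are skipped instead of being audited.
const UNAUDITED_ACTIONS = new Set(['Search']);

// Returns true when an action should still be written to the Audit collection.
function shouldAudit(action) {
  return !UNAUDITED_ACTIONS.has(action);
}

// Calls the supplied audit function only for audited actions.
// Returns true if the audit function was called, false if it was skipped.
function auditIfNeeded(auditFn, args, action, meta, user) {
  if (!shouldAudit(action)) {
    return false;
  }
  auditFn(args, action, meta, user);
  return true;
}
```

In search.js, the existing call could then become something like `auditIfNeeded(QueryUtils.audit, args, 'Search', keywords, user)`, which would skip the write for searches while leaving events such as record creation and publishing unaffected.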