This can be addressed more generally as follows: users could set the retention period in `reana.yaml` to a longer value for their important runs, declaring it at workflow level in `reana.yaml`:
```yaml
workflow:
  type: serial
  resources:
    cvmfs:
      - fcc.cern.ch
  retention: 7 days
  specification:
    steps:
      - name: gendata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
          - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
      - name: fitdata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
          - root -b -q 'code/fitdata.C("${data}","${plot}")'
```
where `retention` is a numerical value indicating the number of days after which the workflow could be garbage-collected following its termination (successful or unsuccessful). It could perhaps be put under the `resources` clause. (We could perhaps call it `expires_in: 7 days` or `expires_in_days: 7` if that sounds more user-friendly than `retention`.)
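For illustration, assuming the value is interpreted as whole days counted from the workflow's termination time (the function name below is hypothetical, not existing REANA code), the expiry moment could be computed roughly like this:

```python
from datetime import datetime, timedelta


def compute_expiry(terminated_at: datetime, retention_days: int) -> datetime:
    """Return the moment after which the workflow run may be garbage-collected.

    `terminated_at` is when the run finished (successfully or not) and
    `retention_days` comes from the `retention` clause in reana.yaml.
    """
    return terminated_at + timedelta(days=retention_days)


# Example: a run that ended on 1 June with `retention: 7 days`
# becomes eligible for garbage collection on 8 June.
expiry = compute_expiry(datetime(2022, 6, 1, 12, 0), 7)
```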
The `reana-client` validation should be amended to check for appropriate expiry values, e.g. allowing only integer days, up to a hard-coded maximum of 14 days. Each REANA instance could have a different maximum, so this may need to be gathered via a REST API call and/or validated on the server side.
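A minimal client-side validation sketch, assuming the clause arrives as a string such as `"7 days"`; the helper names and the hard-coded maximum below are illustrative assumptions, not existing `reana-client` code:

```python
import re

DEFAULT_MAX_RETENTION_DAYS = 14  # each instance could advertise its own maximum via REST


def parse_retention(value: str) -> int:
    """Parse a retention clause such as '7 days' into an integer day count."""
    match = re.fullmatch(r"\s*(\d+)\s*(days?)?\s*", value)
    if not match:
        raise ValueError(f"Invalid retention value: {value!r} (expected e.g. '7 days')")
    return int(match.group(1))


def validate_retention(value: str, max_days: int = DEFAULT_MAX_RETENTION_DAYS) -> int:
    """Check that the retention period is a positive integer not exceeding the instance maximum."""
    days = parse_retention(value)
    if days < 1:
        raise ValueError("Retention must be at least 1 day")
    if days > max_days:
        raise ValueError(f"Retention of {days} days exceeds the maximum of {max_days} days")
    return days
```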
The `reana-client list` output (and ditto for some more commands) should be amended in order to display when the given workflow run expires, so that users are notified. (See also below.)
The REANA UI should be amended in order to display the approaching expiry on the workflow run details page (and perhaps also on the workflow list page).
Note for the GC daemon: we should keep all the workspace inputs (because the users might not have them in a Git repo) and remove only the workflow run assets, i.e. all the files from the workspace that are not specified in `inputs`.
(The goal being that after GC runs, people should still be able to rerun the workflow and obtain the same results; this can be tested with `reana-client restart -w myanalysis.42`. IOW, this is similar to the CPU-vs-HDD resource dilemma: for "hot" analysis runs, it is good to have HDD resources occupied with keeping the latest results; for "cold" analysis runs, we reduce HDD resource usage by deleting unnecessary files, all the while keeping the ability to get the same files back by engaging CPU resources to rerun the recipes.)
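A minimal sketch of that pruning logic, assuming the GC daemon knows the workspace path and the workspace-relative paths declared under `inputs` in `reana.yaml` (the function and parameter names are hypothetical):

```python
from pathlib import Path


def prune_workspace(workspace: Path, declared_inputs: set[str]) -> None:
    """Remove workflow run assets while keeping the declared inputs.

    `declared_inputs` holds workspace-relative paths of files and directories
    listed under `inputs` in reana.yaml; everything else is treated as a
    regenerable asset and deleted.
    """

    def is_input(relative: str) -> bool:
        # Keep the path if it is an input, lives inside an input directory,
        # or is a parent directory of an input.
        return any(
            relative == inp
            or relative.startswith(inp.rstrip("/") + "/")
            or inp.startswith(relative + "/")
            for inp in declared_inputs
        )

    # Walk deepest paths first so that directories are emptied before removal.
    for path in sorted(workspace.rglob("*"), key=lambda p: len(p.parts), reverse=True):
        relative = str(path.relative_to(workspace))
        if is_input(relative):
            continue
        if path.is_symlink() or path.is_file():
            path.unlink()
        elif path.is_dir() and not any(path.iterdir()):
            path.rmdir()
```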
Implemented in 0.9.0 as part of the Workspace-Retention sprint.
In case a user soft-deletes a workflow and then needs to download a file from it, the deletion of workspaces could be gathered and applied every night. This would make the deletion task asynchronous (stemming from a comment) and speed up the responses shown to the client. It also allows a small time window on the administration side to respond to any errors that may occur. For the implementation, this could be a Celery task triggered by Celery beat on a time interval on a worker pod.
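A rough sketch of that nightly batch deletion with Celery beat; the task name, broker URL and the database query helper are illustrative assumptions, not existing REANA code:

```python
import logging
import shutil

from celery import Celery
from celery.schedules import crontab

app = Celery("workspace-cleanup", broker="redis://localhost:6379/0")

# Trigger the batch deletion every night at 03:00 via Celery beat.
app.conf.beat_schedule = {
    "nightly-workspace-cleanup": {
        "task": "tasks.delete_pending_workspaces",
        "schedule": crontab(hour=3, minute=0),
    },
}


def get_soft_deleted_workspaces():
    """Placeholder: query the REANA database for workspaces of soft-deleted workflows."""
    return []  # e.g. a list of workspace paths


@app.task(name="tasks.delete_pending_workspaces")
def delete_pending_workspaces():
    """Physically remove workspaces whose workflows were soft-deleted by users."""
    for workspace_path in get_soft_deleted_workspaces():
        try:
            shutil.rmtree(workspace_path)
        except OSError:
            # Leave it for the next nightly run and let admins inspect the error.
            logging.exception("Could not delete workspace %s", workspace_path)
```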