phac-nml / irida-next

IRIDA Next
https://phac-nml.github.io/irida-next/
Apache License 2.0

[ENHC0010012] Workflow executions/cleanup Service #605

Closed JeffreyThiessen closed 1 month ago

JeffreyThiessen commented 1 month ago

What does this PR do and why?

Describe in detail what your merge request does and why.

Related to #521

Adds WorkflowExecutions::CleanupService

This service deletes all blobs under the workflow_execution.blob_run_directory. These are the intermediate and output files created during the workflow execution's lifespan. The files we keep are copied into new blobs by the completion service, so this service only removes files that are no longer needed.
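As a rough illustration of the idea (not the actual WorkflowExecutions::CleanupService code), the sketch below removes a whole run directory on local disk while leaving a file stored elsewhere untouched. The method name `cleanup_run_directory`, the directory layout, and the file names are all hypothetical; `run_directory` only stands in for workflow_execution.blob_run_directory.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for the cleanup step: removes the whole run directory.
# Returns true only if the directory existed and is gone afterwards.
def cleanup_run_directory(run_directory)
  return false unless Dir.exist?(run_directory)

  # rm_rf ignores errors (e.g. permission problems), so verify the result.
  FileUtils.rm_rf(run_directory)
  !Dir.exist?(run_directory)
end

# Build a throwaway layout resembling a finished run (placeholder contents).
storage_root = Dir.mktmpdir("storage")
run_directory = File.join(storage_root, "run")
FileUtils.mkdir_p(File.join(run_directory, "input"))
FileUtils.mkdir_p(File.join(run_directory, "output"))
File.write(File.join(run_directory, "input", "samplesheet.csv"), "placeholder\n")
File.write(File.join(run_directory, "output", "summary.txt"), "placeholder\n")

# A file outside the run directory, standing in for an output that the
# completion service already copied into a new blob.
kept_file = File.join(storage_root, "kept_output.txt")
File.write(kept_file, "placeholder\n")

removed = cleanup_run_directory(run_directory)
puts "run directory removed: #{removed}"
puts "kept file still present: #{File.exist?(kept_file)}"
```

Because `FileUtils.rm_rf` silently skips anything it cannot delete, the sketch checks afterwards that the directory is really gone instead of assuming success.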

Also moves some blob-related functions out of the services and into blob_helper.rb so that the code can be reused in tests.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other pull requests.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Start IRIDA and run Sapporo
  2. Run a workflow execution via the web interface
  3. Wait for the run to complete
  4. Check whether your run's output files are writable

Depending on your Docker setup for Sapporo, the files generated by Nextflow may be written by root. For this test you can simply modify the directory's permissions.

# In a Rails console, get the run's storage directory
we = WorkflowExecution.last
we.workflow_params['input']
=> "...irida-next-core/storage/7k/f6/7kf65g4ayjp1un4vg2hgia4agiur/samplesheet.csv"
# Back in a regular shell, make the top level of the run's directory writable
sudo chmod -R 777 ".../irida-next-core/storage/7k/f6/7kf65g4ayjp1un4vg2hgia4agiur/"
  5. Run the cleanup service
# In a Rails console
we = WorkflowExecution.last
WorkflowExecutions::CleanupService.new(we).execute
  6. Check that the files are gone
ls ".../irida-next-core/storage/7k/f6/7kf65g4ayjp1un4vg2hgia4agiur/"
>ls: cannot access '.../irida-next-core/storage/7k/f6/7kf65g4ayjp1un4vg2hgia4agiur/': No such file or directory
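The writability check from the steps above can also be done in plain Ruby rather than by eye. The following is a minimal sketch, assuming the run directory is on local disk; `unwritable_paths` is a hypothetical helper, not part of IRIDA Next, and the temporary directory merely stands in for a real run directory.

```ruby
require "find"
require "fileutils"
require "tmpdir"

# Returns every path under `directory` (the directory itself included) that
# the current user cannot write to. Directories are checked as well, because
# removing an entry requires write permission on its parent directory.
def unwritable_paths(directory)
  Find.find(directory).reject { |path| File.writable?(path) }
end

# Throwaway directory standing in for a run directory (placeholder contents).
run_directory = Dir.mktmpdir("run")
output_dir = File.join(run_directory, "output")
FileUtils.mkdir_p(output_dir)
File.write(File.join(output_dir, "result.txt"), "placeholder\n")

blocked = unwritable_paths(run_directory)
if blocked.empty?
  puts "all paths are writable; permissions should not block cleanup"
else
  puts "not writable by the current user:"
  blocked.each { |path| puts "  #{path}" }
end
```

Pointed at a real run directory, a non-empty result would list the root-owned paths whose permissions need changing before the cleanup service is run.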

PR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

github-actions[bot] commented 1 month ago

Simplecov Report

Covered   Threshold
92.34%    90%

ksierks commented 1 month ago

The instructions worked for me! I noticed that the input folder and samplesheet.csv were deleted within the blob directory, but not the output folder. I just wanted to make sure that was expected? Thanks.

JeffreyThiessen commented 1 month ago

> The instructions worked for me! I noticed that the input folder and samplesheet.csv were deleted within the blob directory, but not the output folder. I just wanted to make sure that was expected? Thanks.

The output folder should be deleted too, but with the version of Sapporo we use, the output folder is written by root, so the cleanup cannot remove it. This is not expected to happen on Azure, and I have a Sapporo build that fixes the permissions for our local use: https://github.com/phac-nml/sapporo-service/pull/1/files

For this test though, please see step 4 in my instructions for finding and adding permissions to the output directory.