galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Delete intermediate jobs - workflow option #2916

Open MoHeydarian opened 8 years ago

MoHeydarian commented 8 years ago

It would be nice to have a workflow option to 'delete intermediate files' once they have been used as input. There is currently a workflow option to 'Delete intermediate jobs if they are not used as input for another job', but it only covers jobs whose output is not used by another tool.

Many analysis pipelines create intermediate files that are needed only by the next step and then end up consuming space unnecessarily. For example, most NGS pipelines/workflows do some read trimming before alignment, and once these trimmed reads are aligned they are never used again. In this case, the trimmed reads are a near duplicate of the input reads and count heavily against usage limits.
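
To make the requested behavior concrete, here is a minimal sketch of the bookkeeping involved. It is not Galaxy's implementation or API: `Step`, `plan_deletions`, and the dataset names are hypothetical and used only for illustration. The idea is to count how many steps still need each dataset and mark it for deletion once its last consumer has finished, unless it is a workflow input or an output the user wants to keep.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Step:
    """Hypothetical stand-in for one workflow step (not a Galaxy class)."""
    name: str
    inputs: List[str]   # names of datasets this step consumes
    outputs: List[str]  # names of datasets this step produces


def plan_deletions(
    steps: List[Step],
    workflow_inputs: Iterable[str],
    keep: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each step name to the datasets that could be deleted once it finishes.

    `steps` is assumed to be in execution order. Workflow inputs and
    datasets listed in `keep` are never scheduled for deletion.
    """
    # Count how many times each dataset is still needed as an input.
    remaining: Dict[str, int] = {}
    for step in steps:
        for dataset in step.inputs:
            remaining[dataset] = remaining.get(dataset, 0) + 1

    protected = set(workflow_inputs) | set(keep)
    plan: Dict[str, List[str]] = {}
    for step in steps:
        freed: List[str] = []
        for dataset in step.inputs:
            remaining[dataset] -= 1
            # The last consumer has run, so the dataset is no longer needed.
            if remaining[dataset] == 0 and dataset not in protected:
                freed.append(dataset)
        plan[step.name] = freed
    return plan


# The trimming example from above: trimmed reads are used once, by alignment.
steps = [
    Step("trim", inputs=["raw_reads"], outputs=["trimmed_reads"]),
    Step("align", inputs=["trimmed_reads"], outputs=["alignments"]),
]
plan = plan_deletions(steps, workflow_inputs=["raw_reads"], keep=["alignments"])
print(plan)
```

In this sketch the trimmed reads become deletable as soon as the alignment step completes, while the raw reads and the alignments are left alone. Datasets that no step ever consumes are outside its scope, since the existing option already covers that case.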

Galaxy users attempting to do NGS analysis on usegalaxy.org are constantly battling the space limitations of their account(s). Implementing a workflow option to discard intermediate files would be of great utility to these users.

dannon commented 8 years ago

+1, I like this idea Mo.

dwightkuo commented 5 years ago

+1. This is a constant desire for us as well. Despite allocating quite large user disk quotas, users run out of space very quickly when they batch process large NGS datasets.

innovate-invent commented 3 years ago

It would be great if this option could be globally overridden at invocation so that debugging can be done on files normally deleted.