GATB / gatb-minia-pipeline

GATB Minia assembly pipeline

Flag to remove intermediate files #27

Open bstamps opened 4 years ago

bstamps commented 4 years ago

Hello,

Is there a flag/option to remove intermediate files (SAM, and the .glue files specifically) during a run of the pipeline? I'm running a large assembly and the folder is > 5 TB at the moment. I didn't see any options in the help or in this repo.

harish0201 commented 4 years ago

You can always remove the *.h5 files once your contig assemblies have finished ;)

Of course, I'm assuming they have, since you've reached the mapping stage.

I also tend to remove the glue files.

bstamps commented 4 years ago

Yes, I do as well. I suppose this is really a suggestion for an enhancement to the pipeline. During large assemblies, when you aren't around to remove the files at the right moment, you may fill up the filesystem and cause the run to fail.

soungalo commented 2 years ago

I had the same issue and created a modified version with a --cleanup flag that removes .h5 and glu* files after each iteration.
If one of the developers wants to review the code, I can open a PR. Otherwise, I can just share the script if anybody needs it.
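
For anyone who needs a stopgap in the meantime, here is a rough sketch of the idea. It is not the modified pipeline script mentioned above. The function name `cleanup_intermediates`, the glob patterns, and the `dry_run` switch are all assumptions made for illustration, so check them against the file names your run actually produces before deleting anything.

```python
import glob
import os


def cleanup_intermediates(workdir, patterns=("*.h5", "*.glue*"), dry_run=True):
    """Remove intermediate files in workdir that match the given glob patterns.

    Returns the number of bytes freed (or that would be freed when dry_run=True).
    The default patterns are guesses at the .h5 and glue file names; adjust them.
    """
    # Collect matches in a set so a file matching two patterns is counted once.
    matches = set()
    for pattern in patterns:
        matches.update(glob.glob(os.path.join(workdir, pattern)))

    freed = 0
    for path in sorted(matches):
        if os.path.isfile(path):  # skip directories that happen to match
            freed += os.path.getsize(path)
            if not dry_run:
                os.remove(path)
    return freed
```

Calling `cleanup_intermediates("my_assembly_dir")` with the default `dry_run=True` only reports how much space would be reclaimed; pass `dry_run=False` to actually delete, and only after the stage that reads those files has finished.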

bstamps commented 2 years ago

I'd appreciate the script. Obviously I can't speak for the devs, but the ability to clean up those files after each iteration does seem highly desirable when assembling very large datasets.