TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline

.bw files are duplicated in the results folder #256

cgpu closed this issue 3 years ago

cgpu commented 3 years ago

Problem

After https://github.com/TheJacksonLaboratory/splicing-pipelines-nf/commit/d3278a47a7f5a0fdc0081a126d89289ca8129f5e was added to preview the results on job completion in the GitHub Actions CI test, @angarb spotted that one type of results file is being duplicated.

In short, .bw files are written to the results folder twice. In the example below, the file SRR4238351.bw appears in both of these locations:

  1. ${SRR}/<all-files>/
  2. ${SRR}/<all-files>/all_bigwig
    ├── star_mapped
    │   ├── SRR4238351
    │   │   ├── SRR4238351.Aligned.sortedByCoord.out.bam
    │   │   ├── SRR4238351.Aligned.sortedByCoord.out.bam.bai
    │   │   ├── SRR4238351.Log.final.out
    │   │   ├── SRR4238351.Log.out
    │   │   ├── SRR4238351.Log.progress.out
    │   │   ├── SRR4238351.ReadsPerGene.out.tab
    │   │   ├── SRR4238351.SJ.out.tab
    │   │   ├── SRR4238351.Unmapped.out.mate1
    │   │   └── SRR4238351.bw
    │   └── all_bigwig
    │       ├── SRR4238351.bw

Solution

Instead of letting the publishDir for ${SRR}/<all-files>/ publish every file, give it an explicit pattern that requests only the wanted files, based on the file patterns declared in the output directive here: https://github.com/TheJacksonLaboratory/splicing-pipelines-nf/blob/21b12f9b0f154b5a27b8704b597446015154eeb6/main.nf#L584-L589.

Implementation

We can reuse the file patterns written in the output directive as the publishDir pattern, leaving out the *bw files. See:

https://github.com/TheJacksonLaboratory/splicing-pipelines-nf/pull/251/commits/0c0be9693d14e37f173b213e9de76c57f6d74670

https://github.com/TheJacksonLaboratory/splicing-pipelines-nf/blob/0c0be9693d14e37f173b213e9de76c57f6d74670/main.nf#L621
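For readers who do not want to open the linked commit, the idea can be sketched roughly as below. This is not the code from main.nf: the process name `star_stub`, the `params.outdir` default, both glob patterns, and the `touch` placeholder commands are assumptions made for illustration, and the sketch assumes DSL2 syntax. It shows two `publishDir` directives on one process, where the per-sample one uses a `pattern` that lists the STAR file types and leaves out .bw, so the bigWig file is published only to `all_bigwig`.

```nextflow
// Hypothetical sketch, not the pipeline's real process: the process name,
// params.outdir and both globs are assumptions made for illustration.
params.outdir = 'results'

process star_stub {
    tag "${name}"

    // Per-sample folder: the glob lists the STAR file types explicitly
    // and deliberately leaves out *.bw.
    publishDir "${params.outdir}/star_mapped/${name}", mode: 'copy',
        pattern: '*.{bam,bai,out,tab,mate1}'

    // Shared folder: the bigWig file is published here only.
    publishDir "${params.outdir}/star_mapped/all_bigwig", mode: 'copy',
        pattern: '*.bw'

    input:
    val name

    output:
    path "${name}.*.{bam,bai,out,tab,mate1}", emit: star_files
    path "${name}.bw", emit: bigwig

    script:
    // Placeholder commands: touch stands in for STAR and the bigWig step.
    """
    touch ${name}.Aligned.sortedByCoord.out.bam
    touch ${name}.Aligned.sortedByCoord.out.bam.bai
    touch ${name}.Log.final.out ${name}.Log.out ${name}.Log.progress.out
    touch ${name}.ReadsPerGene.out.tab ${name}.SJ.out.tab
    touch ${name}.Unmapped.out.mate1
    touch ${name}.bw
    """
}

workflow {
    star_stub(Channel.of('SRR4238351'))
}
```

Because `publishDir` only considers files declared in the `output` block, the .bw file still has to be declared as an output; the `pattern` option then decides which directory each declared file is copied to.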