TheJacksonLaboratory / splicing-pipelines-nf

Repository for the Anczukow-Lab splicing pipeline

Makes saving of unmapped files optional, cleanup true by default #284

Closed Vlad-Dembrovskyi closed 2 years ago

Vlad-Dembrovskyi commented 2 years ago

This PR closes #274

Description

Testing

Local

Tested on Nextflow versions 20.04.1 and 20.01.0.

--save_unmapped set to false (default)

NXF_VER=20.04.1 nextflow main.nf -profile ultra_quick_test,docker

N E X T F L O W  ~  version 20.04.1
Launching `main.nf` [tiny_boyd] - revision: 1c98b5ec77
Splicing-pipelines - N F  ~  version 0.1
=====================================
Run name                    : false
Date                        : 10-28-21
Final prefix                : 10-28-21
Assembly name               : false
Reads                       : /.../splicing-pipelines-nf/examples/testdata/single_end/tiny_reads.csv
Bams                        : false
Single-end                  : true
GTF                         : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/genes.gtf
STAR index                  : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/star_2.7.9a_yeast_chr_I.tar.gz
Stranded                    : first-strand
strType                     : 2
Soft_clipping               : true
Save unmapped               : false
rMATS pairs file            : Not provided
Adapter                     : /.../splicing-pipelines-nf/adapters/TruSeq3-SE.fa
Read Length                 : 48
Overhang                    : 100
Minimum length              : 20
Sliding window              : true
rMATS variable read length  : true
rMATS statoff               : false
rMATS paired stats          : false
rMATS novel splice sites    : false
rMATS Minimum Intron Length : 50
rMATS Maximum Exon Length   : 500
Mismatch                    : 2
filterScore                 : 0.66
sjdbOverhangMin             : 3
STAR memory                 : Not provided, Using STAR task max memory
Test                        : true
Download from               : FASTQs directly provided
Key file                    : Not provided
Outdir                      : results
Max Retries                 : 5
Max CPUs                    : 2
Max memory                  : 6 GB
Max time                    : 2d
Mega time                   : 20h
Google Cloud disk-space     : false
Debug                       : false
WARN: The access of `config` object is deprecated
Error strategy              : finish
Workdir cleanup             : true
executor >  local (19)
[7a/449e5c] process > fastqc (SRR... [100%] 4 of 4 ✔
[a1/ff3764] process > trimmomatic... [100%] 4 of 4 ✔
[53/bb719d] process > fastqc_trim... [100%] 4 of 4 ✔
[2c/3e85e1] process > star (SRR42... [100%] 4 of 4 ✔
[02/e1d39c] process > multiqc (1)    [100%] 1 of 1 ✔
[d0/cd4322] process > collect_too... [100%] 1 of 1 ✔
[32/db3266] process > collect_too... [100%] 1 of 1 ✔
-[splicing-pipelines-nf] Pipeline completed successfully-
-[splicing-pipelines-nf] Cleanup: Working directory cleared from intermediate files generated with current run: '/.../splicing-pipelines-nf/work'  -

Completed at: 28-Oct-2021 11:23:30
Duration    : 13m 30s
CPU hours   : 1.0
Succeeded   : 19


NXF_VER=20.01.0 nextflow main.nf -profile ultra_quick_test,docker

N E X T F L O W  ~  version 20.01.0
Launching `main.nf` [nostalgic_sammet] - revision: 1c98b5ec77
Splicing-pipelines - N F  ~  version 0.1
=====================================
Run name                    : false
Date                        : 10-28-21
Final prefix                : 10-28-21
Assembly name               : false
Reads                       : /.../splicing-pipelines-nf/examples/testdata/single_end/tiny_reads.csv
Bams                        : false
Single-end                  : true
GTF                         : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/genes.gtf
STAR index                  : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/star_2.7.9a_yeast_chr_I.tar.gz
Stranded                    : first-strand
strType                     : 2
Soft_clipping               : true
Save unmapped               : false
rMATS pairs file            : Not provided
Adapter                     : /.../splicing-pipelines-nf/adapters/TruSeq3-SE.fa
Read Length                 : 48
Overhang                    : 100
Minimum length              : 20
Sliding window              : true
rMATS variable read length  : true
rMATS statoff               : false
rMATS paired stats          : false
rMATS novel splice sites    : false
rMATS Minimum Intron Length : 50
rMATS Maximum Exon Length   : 500
Mismatch                    : 2
filterScore                 : 0.66
sjdbOverhangMin             : 3
STAR memory                 : Not provided, Using STAR task max memory
Test                        : true
Download from               : FASTQs directly provided
Key file                    : Not provided
Outdir                      : results
Max Retries                 : 5
Max CPUs                    : 2
Max memory                  : 6 GB
Max time                    : 2d
Mega time                   : 20h
Google Cloud disk-space     : false
Debug                       : false
WARN: The access of `config` object is deprecated
Error strategy              : finish
Workdir cleanup             : true

executor >  local (7)
[48/e76fdf] process > fastqc         [100%] 4 of 4, cached: 4 ✔
[48/0fc41c] process > trimmomatic    [100%] 4 of 4, cached: 4 ✔
[ab/1d5fe6] process > fastqc_trimmed [100%] 4 of 4, cached: 3 ✔
[81/142fdf] process > star           [100%] 4 of 4 ✔
[27/858a85] process > multiqc        [100%] 1 of 1 ✔
[03/1f06d0] process > collect_too... [100%] 1 of 1, cached: 1 ✔
[b3/360047] process > collect_too... [100%] 1 of 1 ✔
-[splicing-pipelines-nf] Pipeline completed successfully-
-[splicing-pipelines-nf] Cleanup: Working directory cleared from intermediate files generated with current run: '/.../splicing-pipelines-nf/work'  -
Completed at: 28-Oct-2021 11:54:33
Duration    : 11m 47s
CPU hours   : 1.0 (25.4% cached)
Succeeded   : 7
Cached      : 12


--save_unmapped set to true

NXF_VER=20.04.1 nextflow main.nf -profile ultra_quick_test,docker --save_unmapped true -resume

N E X T F L O W  ~  version 20.04.1
Launching `main.nf` [peaceful_thompson] - revision: 1c98b5ec77
Splicing-pipelines - N F  ~  version 0.1
=====================================
Run name                    : false
Date                        : 10-28-21
Final prefix                : 10-28-21
Assembly name               : false
Reads                       : /.../splicing-pipelines-nf/examples/testdata/single_end/tiny_reads.csv
Bams                        : false
Single-end                  : true
GTF                         : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/genes.gtf
STAR index                  : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/star_2.7.9a_yeast_chr_I.tar.gz
Stranded                    : first-strand
strType                     : 2
Soft_clipping               : true
Save unmapped               : true
rMATS pairs file            : Not provided
Adapter                     : /.../splicing-pipelines-nf/adapters/TruSeq3-SE.fa
Read Length                 : 48
Overhang                    : 100
Minimum length              : 20
Sliding window              : true
rMATS variable read length  : true
rMATS statoff               : false
rMATS paired stats          : false
rMATS novel splice sites    : false
rMATS Minimum Intron Length : 50
rMATS Maximum Exon Length   : 500
Mismatch                    : 2
filterScore                 : 0.66
sjdbOverhangMin             : 3
STAR memory                 : Not provided, Using STAR task max memory
Test                        : true
Download from               : FASTQs directly provided
Key file                    : Not provided
Outdir                      : results
Max Retries                 : 5
Max CPUs                    : 2
Max memory                  : 6 GB
Max time                    : 2d
Mega time                   : 20h
Google Cloud disk-space     : false
Debug                       : false
WARN: The access of `config` object is deprecated
Error strategy              : finish
Workdir cleanup             : true

executor >  local (7)
[1a/eb9b2e] process > fastqc (SRR... [100%] 4 of 4, cached: 4 ✔
[6a/6a0e55] process > trimmomatic... [100%] 4 of 4, cached: 4 ✔
[ab/1d5fe6] process > fastqc_trim... [100%] 4 of 4, cached: 3 ✔
[b7/63d840] process > star (SRR42... [100%] 4 of 4 ✔
[fa/5bd05a] process > multiqc (1)    [100%] 1 of 1 ✔
[03/1f06d0] process > collect_too... [100%] 1 of 1, cached: 1 ✔
[8c/c6dd8d] process > collect_too... [100%] 1 of 1 ✔
-[splicing-pipelines-nf] Pipeline completed successfully-
-[splicing-pipelines-nf] Cleanup: Working directory cleared from intermediate files generated with current run: '/.../splicing-pipelines-nf/work'  -
WARN: Failed to publish file: /.../splicing-pipelines-nf/work/b7/63d840103ca1591a9395fe634ba33c/SRR4238379.Log.out; to: /.../splicing-pipelines-nf/results/star_mapped/SRR4238379/SRR4238379.Log.out [copy] -- See log file for details
Completed at: 28-Oct-2021 12:09:16
Duration    : 12m 3s
CPU hours   : 1.1 (24.9% cached)
Succeeded   : 7
Cached      : 12


NXF_VER=20.01.0 nextflow main.nf -profile ultra_quick_test,docker --save_unmapped true -resume

N E X T F L O W  ~  version 20.01.0
Launching `main.nf` [prickly_majorana] - revision: 1c98b5ec77
Splicing-pipelines - N F  ~  version 0.1
=====================================
Run name                    : false
Date                        : 10-28-21
Final prefix                : 10-28-21
Assembly name               : false
Reads                       : /.../splicing-pipelines-nf/examples/testdata/single_end/tiny_reads.csv
Bams                        : false
Single-end                  : true
GTF                         : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/genes.gtf
STAR index                  : https://lifebit-featured-datasets.s3-eu-west-1.amazonaws.com/projects/jax/splicing-pipelines-nf/star_2.7.9a_yeast_chr_I.tar.gz
Stranded                    : first-strand
strType                     : 2
Soft_clipping               : true
Save unmapped               : true
rMATS pairs file            : Not provided
Adapter                     : /.../splicing-pipelines-nf/adapters/TruSeq3-SE.fa
Read Length                 : 48
Overhang                    : 100
Minimum length              : 20
Sliding window              : true
rMATS variable read length  : true
rMATS statoff               : false
rMATS paired stats          : false
rMATS novel splice sites    : false
rMATS Minimum Intron Length : 50
rMATS Maximum Exon Length   : 500
Mismatch                    : 2
filterScore                 : 0.66
sjdbOverhangMin             : 3
STAR memory                 : Not provided, Using STAR task max memory
Test                        : true
Download from               : FASTQs directly provided
Key file                    : Not provided
Outdir                      : results
Max Retries                 : 5
Max CPUs                    : 2
Max memory                  : 6 GB
Max time                    : 2d
Mega time                   : 20h
Google Cloud disk-space     : false
Debug                       : false
WARN: The access of `config` object is deprecated
Error strategy              : finish
Workdir cleanup             : true

executor >  local (7)
[48/e76fdf] process > fastqc         [100%] 4 of 4, cached: 4 ✔
[f2/265cc3] process > trimmomatic    [100%] 4 of 4, cached: 4 ✔
[ab/1d5fe6] process > fastqc_trimmed [100%] 4 of 4, cached: 3 ✔
[b7/63d840] process > star           [100%] 4 of 4 ✔
[9c/eccadb] process > multiqc        [100%] 1 of 1 ✔
[03/1f06d0] process > collect_too... [100%] 1 of 1, cached: 1 ✔
[ba/7e7f66] process > collect_too... [100%] 1 of 1 ✔
-[splicing-pipelines-nf] Pipeline completed successfully-
-[splicing-pipelines-nf] Cleanup: Working directory cleared from intermediate files generated with current run: '/.../splicing-pipelines-nf/work'  -
Completed at: 28-Oct-2021 12:30:15
Duration    : 12m
CPU hours   : 1.1 (25% cached)
Succeeded   : 7
Cached      : 12
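
The logs above confirm that the runs completed, but they do not show whether unmapped reads were actually written. One possible way to check after each run is to count STAR's unmapped-read files under the output directory. This is only a sketch: it assumes the pipeline keeps STAR's default `Unmapped.out.mate*` file naming and publishes those files somewhere under `results/`, neither of which is confirmed in this PR. `count_unmapped` is a hypothetical helper name.

```shell
# Count files under a results directory that look like STAR unmapped-read output.
# Assumption: the files keep STAR's default "Unmapped.out.mate*" suffix.
count_unmapped() {
  find "${1:-results}" -type f -name '*Unmapped.out.mate*' 2>/dev/null | wc -l | tr -d ' '
}

# Report the count for the default output directory used in the runs above.
echo "unmapped files: $(count_unmapped results)"
```

Under those assumptions, the count should be 0 after a default run and greater than 0 after a run with `--save_unmapped true`.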


CloudOS

--save_unmapped set to false (default)

https://cloudos.lifebit.ai/public/jobs/617a9f3788e0c901db3073b5

--save_unmapped set to true

https://cloudos.lifebit.ai/public/jobs/617a9f6388e0c901db3076f5

Vlad-Dembrovskyi commented 2 years ago

Tested successfully by @angarb on CloudOS ctrl env: https://cloudos.lifebit.ai/public/jobs/617c08502fc32701e6715b91

Tested --save_unmapped true/false: https://cloudos.lifebit.ai/public/jobs/617c18e72fc32701e6728cc7 https://cloudos.lifebit.ai/public/jobs/617c1b792fc32701e6729afe

Also tested successfully on Sumner, including testing of the --cleanup parameter.