nf-core / rnaseq

RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
https://nf-co.re/rnaseq
MIT License

add qualimap #202

Closed lpantano closed 5 years ago

lpantano commented 5 years ago

It might be redundant with RSeQC, but I think it is much faster in some cases, and we could give users the option to use one or the other.

Is it OK if I work on adding qualimap for QC?

apeltzer commented 5 years ago

Sure, why not?

lpantano commented 5 years ago

I feel quite stupid: I tried to add it (and I have built working pipelines with Nextflow before), but for some reason it only runs qualimap for one sample. Maybe somebody can help me?

I added the code here: https://github.com/lpantano/rnaseq/blob/dev/main.nf#L896

As the output shows, qualimap ran only once (1 of 1), while the other per-sample steps ran 4 of 4:

>nextflow run main.nf -profile test --aligner hisat2 --skip_rseq --skip_genebody_coverage
N E X T F L O W  ~  version 19.04.0
Launching `main.nf` [amazing_ptolemy] - revision: 7b66a56c5b
Run Name          : amazing_ptolemy
Reads             : data/*{1,2}.fastq.gz
Data Type         : Single-End
Strandedness      : None
Trimming          : 5'R1: 0 / 5'R2: 0 / 3'R1: 0 / 3'R2: 0
Aligner           : HISAT2
Fasta Ref         : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/genome.fa
GTF Annotation    : https://github.com/nf-core/test-datasets/raw/rnaseq/reference/genes.gtf
Save prefs        : Ref Genome: No / Trimmed FastQ: No / Alignment intermediates: No
Max Resources     : 6 GB memory, 2 cpus, 2d time per job
Output dir        : ./results
Launch dir        : /net/storage001.ib.cluster/om2/user/lpantano/pipelines/dev/rnaseq
Working dir       : /net/storage001.ib.cluster/om2/user/lpantano/pipelines/dev/rnaseq/work
Script dir        : /net/storage001.ib.cluster/om2/user/lpantano/pipelines/dev/rnaseq
User              : lpantano
Config Profile    : test
Config Description: Minimal test dataset to check pipeline function
executor >  local (51)
[32/0456b5] process > get_software_versions [100%] 1 of 1 ✔
[c4/877894] process > makeHisatSplicesites  [100%] 1 of 1 ✔
[6e/52e9c2] process > makeBED12             [100%] 1 of 1 ✔
[8f/4745c5] process > output_documentation  [100%] 1 of 1 ✔
[0e/813c47] process > fastqc                [100%] 4 of 4 ✔
[77/e257cd] process > trim_galore           [100%] 4 of 4 ✔
[f3/e96f0b] process > makeHISATindex        [100%] 1 of 1 ✔
[6b/759aa8] process > hisat2Align           [100%] 4 of 4 ✔
[96/ff288c] process > hisat2_sortOutput     [100%] 4 of 4 ✔
[cc/da35a8] process > preseq                [100%] 4 of 4 ✔
[a5/3396e3] process > markDuplicates        [100%] 4 of 4 ✔
[db/830b64] process > bam_subsample         [100%] 2 of 2 ✔
[ce/c18036] process > rseqc                 [100%] 4 of 4 ✔
[be/03b686] process > qualimap              [100%] 1 of 1 ✔
[4a/0b3b18] process > stringtieFPKM         [100%] 4 of 4 ✔
[c1/243485] process > featureCounts         [100%] 4 of 4 ✔
[56/25a2f2] process > dupradar              [100%] 4 of 4 ✔
[63/909808] process > sample_correlation    [100%] 1 of 1 ✔
[74/d31239] process > merge_featureCounts   [100%] 1 of 1 ✔
[c3/1b7469] process > multiqc               [100%] 1 of 1 ✔
Completed at: 02-May-2019 17:13:25
Duration    : 4m 8s
CPU hours   : 0.1
Succeeded   : 51

apeltzer commented 5 years ago

You probably used a queue channel for the reference data (e.g. the GTF), so it is consumed by the first sample and nothing is left for the others.

lpantano commented 5 years ago

lol, ok, got it.

drpatelh commented 5 years ago

Yep.

file gtf from gtf_qualimap.collect()

Should fix it.
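
For context, here is a minimal sketch of why that one-line change helps. It uses DSL1 syntax as in Nextflow 19.04; the channel setup, the `params.bams` and `params.gtf` names, the `qualimap_results` channel, and the exact qualimap command line are assumptions for illustration, not the code from the linked branch.

```nextflow
// Sketch only: names below are illustrative, not taken from main.nf.
bam_qualimap = Channel.fromPath(params.bams)   // queue channel: one BAM per sample
gtf_qualimap = Channel.fromPath(params.gtf)    // queue channel: emits the single GTF once

process qualimap {
    input:
    file bam from bam_qualimap
    // A plain `file gtf from gtf_qualimap` would be used up by the first task,
    // so the process would stop after one sample.
    // collect() turns it into a value channel that every task can read.
    file gtf from gtf_qualimap.collect()

    output:
    file "${bam.baseName}" into qualimap_results

    script:
    """
    qualimap rnaseq -bam $bam -gtf $gtf -outdir ${bam.baseName}
    """
}
```

With two queue channels as inputs, a process runs only as many tasks as the shorter channel has items, which would explain the "1 of 1" in the log above.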

lpantano commented 5 years ago

thanks. That was fun!

Jokendo-collab commented 5 years ago

Dear All,

I am Javan Okendo, a PhD student at the University of Cape Town, South Africa. I have single-end RNA-seq data that I would like to analyze with a Nextflow pipeline. I am writing to ask whether anyone who has a pipeline that can analyze such data could point me in the right direction. GitHub links or any other suggestions would be a great help.

Thank you,

Javan Okendo, BSc, MScMed, Ph.D. Candidate (Computational Proteomics and Bioinformatics) | Blackburn Lab | Systems and Chemical Biology Division | Department of Integrative Biomedical Sciences | Faculty of Health Sciences

drpatelh commented 5 years ago

Hi Javan. That's great! However, you are posting this message in completely the wrong place :wink:. This issue is reserved for discussion of adding qualimap to the nf-core/rnaseq pipeline.

The best thing to do would be to join the nf-core Slack workspace: https://nf-core-invite.herokuapp.com/

Join the rnaseq channel once there, and someone will be able to have a more informal chat with you.

To start I would recommend checking out the nf-core/rnaseq pipeline, and reading some of the extensive docs that have been created for this very purpose: https://github.com/nf-core/rnaseq

If something doesn't make sense, or if you are having problems getting the pipeline running, then please post on the Slack rnaseq channel.

Look forward to seeing you there!

lpantano commented 5 years ago

Ok, all is working and I am going through the list to make the pull request, but I have a couple of questions:

Thanks!

apeltzer commented 5 years ago

Ideally, a.) should resolve itself once we have updated the environment.yml on the dev branch. You could also open a separate PR that updates some conda packages (please not STAR; there is a comment in the environment file explaining why) and adds qualimap, before you open the final PR with the actual qualimap code.

That way, the tests will find qualimap properly to test everything when you do your "real" PR ;-)

Since just adding some recipes shouldn't break things, I think it's fine to do a separate PR that only includes the environment.yml and CHANGELOG changes. Maybe that is something we should document as well.

lpantano commented 5 years ago

Oh! that is smart :) I’ll do that PR first!