carpentries-incubator / cwl-novice-tutorial

Introduction to Workflows with Common Workflow Language
https://carpentries-incubator.github.io/cwl-novice-tutorial/

Don't need to call fastqc twice #141

Open swzCuroverse opened 1 year ago

swzCuroverse commented 1 year ago

There seems to be an extra step: the workflow calls fastqc twice. FastQC is designed to handle both FASTQ files in a single call, so you don't have to call it twice. (1)

    quality_control_forward:
      run: bio-cwl-tools/fastqc/fastqc_2.cwl
      in:
        reads_file: rna_reads_fruitfly_forward
      out: [html_file]

    quality_control_reverse:
      run: bio-cwl-tools/fastqc/fastqc_2.cwl
      in:
        reads_file: rna_reads_fruitfly_reverse
      out: [html_file]

https://github.com/common-workflow-library/bio-cwl-tools/blob/release/fastqc/fastqc_1.cwl would allow you to do both in one step. I'm not sure why we are using fastqc_2; perhaps the reason is documented somewhere. A single fastqc step would make the workflow much easier to understand.

swzCuroverse commented 1 year ago

I think fastqc_2 was written to restrict which data can be passed in, for some reason, but I believe fastqc can still be run on an array of files, as shown in fastqc_1.
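
For illustration, a rough sketch of what a single combined step might look like. The input id `reads_files` and the output id `html_files` are placeholders I made up, not ids taken from fastqc_1.cwl, so check the tool description for the real ids and types before trying this. It also assumes the tool accepts an array of files, as suggested above.

```yaml
# Sketch only: the ids marked "hypothetical" are NOT from fastqc_1.cwl.
# Feeding several sources into one step input requires this at the workflow level:
#   requirements:
#     MultipleInputFeatureRequirement: {}
quality_control:
  run: bio-cwl-tools/fastqc/fastqc_1.cwl
  in:
    reads_files:                      # hypothetical id for an array-of-files input
      source:
        - rna_reads_fruitfly_forward
        - rna_reads_fruitfly_reverse
      linkMerge: merge_flattened      # combine both File sources into one File[] value
  out: [html_files]                   # hypothetical output id
```

If this works, the two `quality_control_forward` / `quality_control_reverse` steps collapse into one, at the cost of the learner needing to understand multiple input sources.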