genome / analysis-workflows

Open workflow definitions for genomic analysis from MGI at WUSM.
MIT License

CWL workflow for building reference indices #49

Open jasonwalker80 opened 7 years ago

jasonwalker80 commented 7 years ago

Input is a FASTA reference. Output is the aligner index, the samtools faidx index, and the Picard sequence dictionary.
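As a rough illustration of those three outputs, here is a minimal Python sketch that only builds the command lines and runs nothing. It assumes BWA as the aligner (the issue does not name one) and a `picard` wrapper script on the PATH. The function name `reference_index_commands` is hypothetical and not part of this repository.

```python
from pathlib import Path


def reference_index_commands(fasta):
    """Return argv lists that would index a FASTA reference.

    Assumptions (not from the issue): BWA is the aligner, and Picard is
    reachable through a `picard` wrapper script.
    """
    fasta = Path(fasta)
    dict_path = fasta.with_suffix(".dict")  # reference.fa -> reference.dict
    return [
        # Aligner index; BWA writes its index files next to the FASTA.
        ["bwa", "index", str(fasta)],
        # samtools faidx writes <fasta>.fai.
        ["samtools", "faidx", str(fasta)],
        # Picard sequence dictionary, using the legacy KEY=value argument style.
        ["picard", "CreateSequenceDictionary", f"R={fasta}", f"O={dict_path}"],
    ]


if __name__ == "__main__":
    for argv in reference_index_commands("reference.fa"):
        print(" ".join(argv))
```

In a CWL workflow each of these would more likely be its own tool step, with the FASTA as the single shared input.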

jasonwalker80 commented 7 years ago

Resolving this for now. The indices are currently a required input, and alignment is not a requirement at this time.

jasonwalker80 commented 7 years ago

More and more, it seems that a workflow to generate reference sequence index files and aligner index files will be a future need. RNA-seq aligners like HISAT2 have multi-step workflows to generate the right files. In addition, there are issues with VEP requiring its own index file and write permissions; see #177 and #179. Creating aligner index files is also a known issue that needs to be tackled in the GMS for CWL pipeline support.
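To make the multi-step point concrete, here is a hedged Python sketch of one common way to build an annotation-aware HISAT2 index: extract splice sites and exons from a GTF, then pass both to `hisat2-build`. It only plans the steps. The function name `hisat2_index_plan` and the `.ss` / `.exon` file names are assumptions for illustration, not something this repository defines.

```python
def hisat2_index_plan(fasta, gtf, index_base):
    """Return ordered (argv, stdout_file) steps for a HISAT2 index.

    The two extract scripts print to stdout, so each step records the file
    its output would be redirected to (None means no redirection).
    Output file names are illustrative assumptions.
    """
    splice_sites = f"{index_base}.ss"
    exons = f"{index_base}.exon"
    return [
        # Step 1: splice sites from the annotation.
        (["hisat2_extract_splice_sites.py", gtf], splice_sites),
        # Step 2: exons from the annotation.
        (["hisat2_extract_exons.py", gtf], exons),
        # Step 3: build the index using both intermediate files.
        (["hisat2-build", "--ss", splice_sites, "--exon", exons,
          fasta, index_base], None),
    ]


if __name__ == "__main__":
    for argv, stdout_file in hisat2_index_plan("reference.fa", "genes.gtf", "ref"):
        redirect = f" > {stdout_file}" if stdout_file else ""
        print(" ".join(argv) + redirect)
```

The ordering matters because the third step consumes files produced by the first two, which is the kind of dependency a CWL workflow would express through step inputs and outputs.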