NAL-i5K / Organism_Onboarding

A workflow to make organism onboarding pipeline easy to handle as an I/O pipeline

Workflow: flow_setup #54

Closed r06942072 closed 5 years ago

r06942072 commented 5 years ago

flow_setup is broken down into three parts: setup_tree, download_genomic, and download_others.

tree.cwl, wget.cwl, extract_md5checksums.cwl, check_md5sum.cwl, gunzip.cwl
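
For readers unfamiliar with the download steps, here is a rough Python sketch of what the checksum and gunzip tools do conceptually. It is not the CWL code from this repository. The function names are hypothetical, and the checksum listing format (`<md5>  <path>` per line) is an assumption.

```python
import gzip
import hashlib
import shutil
from pathlib import Path


def parse_md5_listing(text: str) -> dict[str, str]:
    """Map file name -> expected MD5, assuming lines of the form '<md5>  <path>'."""
    expected = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, path = parts
            expected[Path(path.strip()).name] = digest.lower()
    return expected


def md5_of(path: Path) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_and_gunzip(archive: Path, expected: dict[str, str]) -> Path:
    """Check the archive against its expected MD5, then decompress it next to itself."""
    if md5_of(archive) != expected.get(archive.name):
        raise ValueError(f"MD5 mismatch for {archive.name}")
    out = archive.with_suffix("")  # e.g. genome.fa.gz -> genome.fa
    with gzip.open(archive, "rb") as src, out.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    return out
```

The sketch refuses to decompress a file whose checksum does not match, which is the ordering the tool names suggest (check_md5sum before gunzip).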

r06942072 commented 5 years ago

The technical problem with CWL is solved.

r06942072 commented 5 years ago
mpoelchau commented 5 years ago

Genomic fasta (in_fasta) needs to be here at the end of the whole workflow: data/other_species/[gggsss]/[assembly-name]/scaffold/

Genomic gff (in_gff) needs to be here at the end of the whole workflow: data/other_species/[gggsss]/[assembly-name]/scaffold/analyses/[annotation-release-number]/

It doesn't matter (at least to me) whether the files are in those directories at the end of flow_setup or flow_apollo2 - but it would be great if they can be moved there at the end of the entire workflow of workflows.
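
To make the two target locations concrete, here is a small Python sketch that builds them from the placeholders in the paths above. The function name and the example values are hypothetical. Only the directory layout comes from this comment.

```python
from pathlib import PurePosixPath


def target_dirs(gggsss: str, assembly_name: str, annotation_release: str,
                root: str = "data/other_species") -> dict[str, PurePosixPath]:
    """Return the final directories for in_fasta and in_gff."""
    scaffold = PurePosixPath(root) / gggsss / assembly_name / "scaffold"
    analyses = scaffold / "analyses" / annotation_release
    return {"in_fasta": scaffold, "in_gff": analyses}


# Made-up placeholder values, for illustration only.
dirs = target_dirs("examspe", "Example_1.0", "Example_Release_1")
print(dirs["in_fasta"])
print(dirs["in_gff"])
```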

r06942072 commented 5 years ago

【change of code】

Difficulties:

  1. File not found. Because the output section of final-workflow.cwl is executed last, the genomic data (fasta and gff) cannot be found when the pipeline reaches flow_apollo.

Solution:

  1. We should go back to the root problem and discuss a nicer way to capture "gff" and "fasta" from the gunzip block.

Solution: append a mv_to_data block at the end of flow_setup.
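
As an illustration of what a mv_to_data step could do, here is a minimal Python sketch. The real block in this repository is a CWL step and its implementation is not shown in this thread. The function signature and argument names are assumptions. The directory layout follows the one requested earlier in the thread.

```python
import shutil
from pathlib import Path


def mv_to_data(in_fasta: Path, in_gff: Path, data_root: Path,
               gggsss: str, assembly_name: str, annotation_release: str) -> dict[str, Path]:
    """Move the gunzipped fasta and gff into their final data directories."""
    scaffold_dir = data_root / "other_species" / gggsss / assembly_name / "scaffold"
    gff_dir = scaffold_dir / "analyses" / annotation_release
    moved = {}
    for key, src, dest_dir in (("in_fasta", in_fasta, scaffold_dir),
                               ("in_gff", in_gff, gff_dir)):
        # Create the target directory if the tree step has not already done so.
        dest_dir.mkdir(parents=True, exist_ok=True)
        moved[key] = Path(shutil.move(str(src), str(dest_dir / src.name)))
    return moved
```

Running the move at the end of flow_setup means the files are already in place before flow_apollo starts, rather than depending on the final workflow's output section.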