The reference genome(s) should be accepted as a command-line parameter. A reference is never going to be a static file bundled with the workflow; it depends on the user's needs, given their dataset and the questions they wish to answer. It should also be possible to specify multiple reference genomes with --refs "/path/to/reference_genomes/*.fasta" when running the workflow. Keep the glob quoted so that the shell passes the pattern through to Nextflow unexpanded.
In the workflow body, you'll need to create a Channel for the reference genome files:
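    workflow {
        // one channel item per reference FASTA matched by --refs
        ch_refs = Channel.fromPath(params.refs)

        // ch_reads is assumed to be defined earlier, emitting [ sample_id, reads1, reads2 ]
        // combine all read sets with each reference
        ch_reads_and_refs = ch_reads | combine(ch_refs)

        // each value in ch_reads_and_refs will contain [ sample_id, reads1, reads2, ref_fasta ]
        MAP(ch_reads_and_refs)
    }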
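For the combined channel to be usable, the MAP process has to declare an input tuple that matches the [ sample_id, reads1, reads2, ref_fasta ] shape. Here is a minimal sketch of what that could look like. The choice of bwa as the aligner, the output file naming, and the tag are assumptions for illustration only, so swap in whatever mapper and naming scheme your workflow actually uses.

```nextflow
// Hypothetical MAP process: the aligner (bwa) and output naming are assumptions
process MAP {
    // label each task with the sample and the reference it is mapped against
    tag "${sample_id} vs ${ref_fasta.baseName}"

    input:
    // must match the tuple emitted by ch_reads_and_refs
    tuple val(sample_id), path(reads1), path(reads2), path(ref_fasta)

    output:
    // include the reference name so outputs from different references stay distinct
    tuple val(sample_id), path("${sample_id}.${ref_fasta.baseName}.sam")

    script:
    """
    bwa index ${ref_fasta}
    bwa mem ${ref_fasta} ${reads1} ${reads2} > ${sample_id}.${ref_fasta.baseName}.sam
    """
}
```

Because combine emits every pairing of read set and reference, a run with 3 samples and 2 references would launch 6 MAP tasks, which is why the reference name is part of the output file name here.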