carpentries-incubator / snakemake-novice-bioinformatics

Introduction to Snakemake for Bioinformatics
https://carpentries-incubator.github.io/snakemake-novice-bioinformatics

Introduce lists of inputs/outputs before named inputs and outputs #64

Open tbooth opened 1 month ago

tbooth commented 1 month ago

From @cmeesters:

[in ep03] we get to named in- and output without explaining the background or that in- and output are usually lists.

tbooth commented 1 month ago

I'm not sure what "the background" is here, but it is true that many new users of Snakemake will use lists where they need multiple inputs and outputs, resulting in shell commands like:

"kallisto quant -i {input[0]} -o kallisto.{wildcards.sample} {input[1]} {input[2]}"

rather than the much more readable:

"kallisto quant -i {input.index} -o kallisto.{wildcards.sample} {input.fq1} {input.fq2}"
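
To illustrate the difference, here is a minimal sketch in plain Python, not Snakemake itself. It relies only on the fact that both styles are valid Python format-string fields, and it uses `str.format` with a list and a `SimpleNamespace` as stand-ins for Snakemake's `input` and `wildcards` objects. The file names and the sample name are hypothetical.

```python
from types import SimpleNamespace

# Hypothetical file names and sample name, for illustration only.
index = "transcriptome.idx"
fq1 = "reads/ref1_1.fq"
fq2 = "reads/ref1_2.fq"
wildcards = SimpleNamespace(sample="ref1")

# Positional style: the reader has to remember which file sits at which position.
positional_cmd = (
    "kallisto quant -i {input[0]} -o kallisto.{wildcards.sample} "
    "{input[1]} {input[2]}"
).format(input=[index, fq1, fq2], wildcards=wildcards)

# Named style: each placeholder says which file it refers to.
named_cmd = (
    "kallisto quant -i {input.index} -o kallisto.{wildcards.sample} "
    "{input.fq1} {input.fq2}"
).format(
    input=SimpleNamespace(index=index, fq1=fq1, fq2=fq2),
    wildcards=wildcards,
)

print(positional_cmd)
print(named_cmd)
```

Both templates expand to the same command line; the named version just stays correct and readable if the order of the inputs is later changed.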

So I'd say this is a feature of the course, not a bug. We're introducing things in this order:

1) Input and output are single files (in ep01)
2) Actually, you can have multiple input and/or output files and assign them names (in ep03)
3) Actually, you can use a list if the number of input files is variable (in ep05)

There are some funky cases where we actually need the output of a rule to be a list, but these are not for an intro course!

I'll add exposition to this effect to the instructor notes.