carpentries-incubator / snakemake-novice-bioinformatics

Introduction to Snakemake for Bioinformatics
https://carpentries-incubator.github.io/snakemake-novice-bioinformatics

Sample wildcard is used for different things without a clear explanation #43

Closed tbooth closed 1 year ago

tbooth commented 1 year ago

From @jdblischak

03 - Chaining rules

Note that {sample} changed in the rule kallisto_quant

When introducing the rule kallisto_quant, may want to mention that the wildcard {sample} is different from the above rules. For example, now it's ref1 instead of ref1_1. This note could be added to the list under "There are many things to note here:"

tbooth commented 1 year ago

Coming back to this, I think the initial choice of {sample} as the wildcard name in chapter 2 is the problem. These are paired-end reads, so the file we are counting represents only half the data for the sample, even though we discover the number of reads per sample from it.

I think I'll change this to {myfile}, as the rule can count the reads in any .fq file, and example variable names like "mylist", "mycount", ... are pretty typical in other programming tutorials. I'll also note that the wildcard in kallisto_quant is matching ref1 rather than ref1_1, as suggested.
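
To illustrate the difference, here is a rough Python sketch of how the same file name yields different wildcard values depending on the pattern around the wildcard. It is not Snakemake's own code: the helper `match_wildcards` is hypothetical, and the patterns `reads/{myfile}.fq` and `trimmed/{sample}_1.fq` are assumed for illustration rather than copied from the lesson. It relies only on the fact that a Snakemake wildcard matches the regular expression `.+` by default.

```python
import re


def match_wildcards(template, path):
    """Hypothetical helper: mimic default Snakemake wildcard matching.

    Each {name} in the template becomes a named regex group matching `.+`.
    Simplification: every wildcard name may appear only once per template.
    """
    # re.split with a capturing group returns literal text and wildcard
    # names alternately: literal, name, literal, name, ...
    parts = re.split(r"\{(\w+)\}", template)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)       # literal text, matched exactly
        else:
            regex += f"(?P<{part}>.+)"     # wildcard, matches any non-empty text
    m = re.fullmatch(regex, path)
    return m.groupdict() if m else None


# A rule that counts reads in any single .fq file (assumed pattern).
print(match_wildcards("reads/{myfile}.fq", "reads/ref1_1.fq"))

# A rule that takes both files of a pair, so "_1" is part of the
# literal text and the wildcard holds only the sample name (assumed pattern).
print(match_wildcards("trimmed/{sample}_1.fq", "trimmed/ref1_1.fq"))
```

The first call captures `ref1_1` (one file of the pair) and the second captures `ref1` (the sample), which is why a name like {myfile} fits the counting rule better than {sample}.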