Closed clami66 closed 2 years ago
@percyfal please also note that this PR changes the way some files are named by avoiding the $REF_ID variable, which (I think) makes it impossible to write snakemake rules: REF_ID is extracted at runtime and used in the names of some output files, and I did not find a way to make this work, even with checkpointing.
On the other hand, I don't know that it is necessary to use $REF_ID, since it is unique to each sample and taxid. I think it would be good to find consensus about this in #56 before deciding whether we should merge this.
BTW, this PR fixes #56
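To illustrate the naming point above, here is a minimal Python sketch of output paths that depend only on the sample and the taxid, so they can be computed before anything runs. The function name, the `root` default, and the suffix list are assumptions for illustration (the suffixes are copied from the example listing in this thread); the actual rules in the PR may name things differently.

```python
from pathlib import PurePosixPath

# Hypothetical suffix list, copied from the example listing in this thread.
SUFFIXES = ["bam", "sorted.bam", "sorted.bam.bai", "breadth_of_coverage"]


def authentication_outputs(sample, taxid, root="results/AUTHENTICATION"):
    """Return output paths that depend only on sample and taxid.

    Nothing here needs a value extracted at runtime, so the same
    names could be declared up front as rule outputs.
    """
    outdir = PurePosixPath(root) / sample / str(taxid)
    return [str(outdir / f"{taxid}.{suffix}") for suffix in SUFFIXES]


if __name__ == "__main__":
    for path in authentication_outputs("bar", 632):
        print(path)
```

With names like these, the sample and taxid wildcards alone are enough to declare every output, which is what a runtime-extracted ID prevents.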
I'm not sure I understand where $REF_ID is used? Do you mean where you use the suffix _done? Otherwise, what you say makes sense. I'll trigger the tests and wait for the consensus on #56 before merging.
The output files in AUTHENTICATION usually look like this:
b-an01 [/proj/nobackup/metagenomics/ancient-microbiome-smk/.test]$ ls -ltr results/AUTHENTICATION/bar/632/
total 316
-rw-rw----+ 1 pochonz ps30331 96 feb 17 14:20 ref_64.sorted.bam.bai
-rw-rw----+ 1 pochonz ps30331 1335 feb 17 14:20 ref_64.sorted.bam
...
-rw-rw----+ 1 pochonz ps30331 218894 feb 17 14:20 ref_64.breadth_of_coverage
-rw-rw----+ 1 pochonz ps30331 1292 feb 17 14:20 ref_64.bam
Where ref_64 is the name of the first reference sequence output by MaltExtract (which apparently doesn't always work out, as seen in #56). This PR does without the reference name and replaces ref_64 with the taxid again. In theory we could just rename the files so that no ID is used, since these are uniquely identified by the folder structure /bar/632.
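As a rough sketch of why the folder structure alone is enough, the sample and taxid can be read back from the directory part of any output path, whatever the file itself is called. The helper name below is hypothetical, and the layout `.../AUTHENTICATION/<sample>/<taxid>/<file>` is assumed from the listing above.

```python
from pathlib import PurePosixPath


def sample_and_taxid(path):
    """Recover (sample, taxid) from the directory structure only.

    Assumes the layout .../AUTHENTICATION/<sample>/<taxid>/<file>;
    the file name itself is never inspected.
    """
    parts = PurePosixPath(path).parts
    i = parts.index("AUTHENTICATION")
    return parts[i + 1], parts[i + 2]


if __name__ == "__main__":
    # Both names resolve to the same sample and taxid.
    print(sample_and_taxid("results/AUTHENTICATION/bar/632/ref_64.sorted.bam"))
    print(sample_and_taxid("results/AUTHENTICATION/bar/632/sorted.bam"))
```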
Since authentic.sh is quite complex and takes lots of inputs/generates lots of outputs, I have made it into another workflow file instead. This forced me to look into the authentication code and see a few things that could be improved (e.g. issues #53, #56). I'm sure lots of this can be improved as I'm not exactly fluent in snakemake, but I hope it can be useful.