Open MarieLataretu opened 1 year ago
I'm sorry I can't help directly but maybe @ktmeaton can? She's the most knowledgeable person about sc2rf as far as I'm aware :)
Hi Marie,
Here's my understanding of the problematic parameters.
--clades all: consider all clades defined in mapping.csv as potential parents. As of bd2a4009, there are 36 potential clades.
--parents 1-35: restrict the number of parents in the output to a minimum of 1 and a maximum of 35.
With these arguments, --parents 1-35 conflicts with --clades all, which includes 36 clades. My simple fix is to set --parents to an extremely high number (e.g. --parents 1-1000). The following command and example data should not generate the warning about conflicting arguments.
Example data of 6 recombinants in GenBank: alignment.fasta.gz (gunzip first)
python3 sc2rf.py alignment.fasta \
--csvfile tutorial.csv \
--breakpoints 1-2 \
--max-intermission-count 3 \
--max-intermission-length 1 \
--unique 1 \
--max-ambiguous 10000 \
--max-name-length 55 \
--clades all \
--force-all-parents \
--parents 1-1000
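To illustrate why widening the range avoids the conflict, here is a minimal sketch. It is not sc2rf's actual code: the helper names `parse_range` and `parents_conflict` are hypothetical, and it assumes the warning is raised whenever the number of selected clades falls outside the MIN-MAX range given to --parents.

```python
def parse_range(spec):
    """Parse a 'MIN-MAX' string such as '1-35' into a (min, max) tuple of ints."""
    low, high = spec.split("-")
    return int(low), int(high)


def parents_conflict(parents_spec, n_clades):
    """Return True when n_clades lies outside the MIN-MAX range in parents_spec.

    Assumption (not taken from sc2rf itself): this is the condition under
    which the arguments are reported as conflicting.
    """
    low, high = parse_range(parents_spec)
    return not (low <= n_clades <= high)


N_CLADES = 36  # clade count reported in this thread for --clades all

print(parents_conflict("1-35", N_CLADES))    # True: 36 clades exceed the maximum of 35
print(parents_conflict("1-1000", N_CLADES))  # False: 36 fits within 1-1000
```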
However, with these arguments, no recombination will be detected either. This is because BA.4 and BA.5 really complicated things. From my understanding, there are very few diagnostic mutations that are exclusively found in BA.2 and not BA.4 or BA.5 (and few diagnostic mutations found in BA.5, but not BA.2 or BA.4, etc.). From my experience, BA.2, BA.4, and BA.5 cannot all be included as potential parents at the same time; one of them has to be dropped. So the following debugging parameters should work for the example data:
python3 sc2rf.py alignment.fasta \
--csvfile tutorial.csv \
--breakpoints 0-10 \
--max-intermission-count 3 \
--max-intermission-length 1 \
--unique 0 \
--max-ambiguous 10000 \
--max-name-length 55 \
--force-all-parents \
--parents 1-1000 \
--clades BA.1 BA.2 BA.5 21J
Thanks for your advice, @ktmeaton ! I'll check what fits best with our current usage.
The other problem was/is that the input, as a positional argument, won't be recognized after any argument that accepts a list.
It's somewhat clear; I just expected the readme example to work ☺️
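For anyone hitting the same thing, the behaviour can be reproduced with a small argparse sketch. This is a stand-in parser, not sc2rf's own; it assumes the script uses Python's argparse with a list-valued --clades option and a positional input, and the option definitions below are illustrative only.

```python
import argparse

# Hypothetical stand-in parser mirroring the options discussed in this thread.
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("input", nargs="*")
parser.add_argument("--clades", "-c", nargs="*")
parser.add_argument("--force-all-parents", "-f", action="store_true")

# Input placed directly after a list-valued option: the option swallows it.
swallowed = parser.parse_args(["--clades", "all", "alignment.fasta"])
print(swallowed.clades)  # ['all', 'alignment.fasta']

# Input placed first, as in the commands earlier in this thread.
ok = parser.parse_args(["alignment.fasta", "--clades", "all"])
print(ok.input, ok.clades)  # ['alignment.fasta'] ['all']

# Another flag between the list-valued option and the input ends the list.
also_ok = parser.parse_args(
    ["--clades", "all", "--force-all-parents", "alignment.fasta"]
)
print(also_ok.input, also_ok.clades)  # ['alignment.fasta'] ['all']
```

Putting the input right after the script name, as in the commands above, sidesteps the problem regardless of which list-valued option comes last.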
Hi there,
I was just wondering why I have no output, so I tried the second example from here: https://github.com/lenaschimmel/sc2rf#no-output--some-sequences-not-shown
So I added --clades all --force-all-parent to my call, but it seems that they can't both be used:
Also, --clades all can't be used as the last argument (before the input) because the input won't be recognized.
I'm not sure if this is only a problem with my setup/input.
Would you suggest using -c all or -f? My full command is:
Best,
Marie