Open wwood opened 1 year ago
FWIW, it seems like that switch happens after an indel, as observed with a second example not shown.
Hi, I think there is a misunderstanding here:
when using a fasta as input, the REFERENCE among the fasta records must be specified using option -R. Otherwise the most frequent base is used as ref.
then option -f is used to save the consensus:
-f, --fasta
save computed fasta sequence in this file.
furthermore, you should also have a look at: https://github.com/sanger-pathogens/snp_sites
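To make the "most frequent base is used as ref" point concrete, here is a minimal Python sketch of a per-column majority consensus. It is not the tool's actual code: the function name `majority_consensus` and the toy sequences are made up for illustration, and treating a gap like any other character is an assumption about how such a consensus could behave.

```python
from collections import Counter


def majority_consensus(aligned):
    """Return the most frequent character of each alignment column.

    Assumption: gaps ('-') are counted like any other character, so the
    result always has the same length as the alignment itself.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        # most_common(1) returns [(character, count)] for the top character
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


ref = "ACGT-A"  # hypothetical reference row, one gap
alt = "ACTTCA"  # hypothetical non-reference row

# Reference in the majority: the consensus equals the reference row.
print(majority_consensus([ref, ref, alt]))
# Non-reference in the majority: the consensus follows the other sequence,
# so REF and ALT would appear swapped relative to the intended reference.
print(majority_consensus([ref, alt, alt]))
```

Under these assumptions the consensus is as long as the gapped alignment, not the ungapped reference.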
Hi,
Thanks for the quick response. You're right, I hadn't quite appreciated the point about the consensus. The help says:
-R, --REF
reference name used for the CHROM column. Optional
Default: chrUn
So as I understand it, that only changes what appears in the first column, and doesn't affect the sequence used as the reference. Are you saying it also changes the consensus?
Anyway, I tried to work around this by adding the same ref sequence twice, slightly changing the name of one copy to keep the names unique, followed by the non-ref sequence. That way the consensus should always be the reference.
That worked for the above example, but it seemed to do something unexpected when the reference has gaps. Specifically, the input sequences were 5002 bp including gaps and 5000 bp without them. I was expecting the consensus to be 5000 bp, but it comes out as 5002 bp. Is that expected behaviour? If that isn't clear, I can cook up a reduced example as above.
snp_sites is what I tried first, but that doesn't seem to have any option to report INDELs, only SNPs. Maybe there's something I missed?
Anyway, I'm happy to help further here if that's useful for you, but I just decided to write this code myself, since I only have 2 sequences for my use-case. Let me know.
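For the two-sequence case, the idea can be sketched in a few lines of Python. This is only an illustration, not the code actually written for this use-case: the function name `pairwise_variants` and the toy alignment are hypothetical. It assumes the first row is the reference, reports 1-based positions on the ungapped reference, and anchors indels on the preceding reference base, as VCF does.

```python
def pairwise_variants(ref_aln, alt_aln):
    """List (pos, ref, alt) tuples from two gapped, equal-length rows.

    pos is 1-based on the ungapped reference. Indels are anchored on the
    reference base just before the gap run.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, a = ref_aln[i], alt_aln[i]
        if r != "-" and a != "-":
            ref_pos += 1
            if r != a:
                variants.append((ref_pos, r, a))  # SNP
            i += 1
            continue
        if i == 0:
            raise ValueError("indel at alignment start has no anchor base")
        # Extend over the whole run of columns that contain a gap.
        j = i
        while j < n and (ref_aln[j] == "-" or alt_aln[j] == "-"):
            j += 1
        ref_seg = ref_aln[i:j].replace("-", "")
        alt_seg = alt_aln[i:j].replace("-", "")
        if ref_seg != alt_seg:
            anchor = ref_aln[i - 1]  # column before the run has no gap
            variants.append((ref_pos, anchor + ref_seg, anchor + alt_seg))
        ref_pos += len(ref_seg)
        i = j
    return variants


# Toy alignment: one SNP, one insertion, one deletion.
for variant in pairwise_variants("ACGT-ACG", "ACTTCA-G"):
    print(variant)
```

Because the reference is fixed by argument order, REF and ALT cannot swap the way they can with a majority-based consensus.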
Subject of the issue
Output seems incorrect.
Your environment
On Linux, see output for version info
Steps to reproduce
With this input:
and running
The first two are right, but the second two aren't: they are T->C not C->T, and T->A not A->T. Did I get that right? Here's the blast output to help visualise
Otherwise this tool seems like exactly what we need, so any help is much appreciated. Thanks, ben