sanger-pathogens / snp-sites

Finds SNP sites from a multi-FASTA alignment file
http://sanger-pathogens.github.io/snp-sites/

Gaps not reported in output? #61

Closed by mgalardini 7 years ago

mgalardini commented 7 years ago

Hi,

Thanks for this very fast and elegant tool. I was wondering whether there is an option to also consider gaps. I am using this alignment, which has a region with gaps in a single sequence. I ran the following commands:

snp-sites -o out.fasta aln.trimmed.fasta
head out.fasta
>genome|b0463
GCGCAA
>NT12002_188|b0463
GCGCAA
>NT12003_214|b0463
GCGCAA
>NT12004_22|b0463
GCGCAA
>NT12005_17|b0463
GTGCAA

The output includes only the SNPs and not the gaps. I imagine this is because the region with gaps doesn't contain any SNPs?

Is this expected behaviour? Best, Marco

andrewjpage commented 7 years ago

Hi Marco, yes, this is expected behaviour. Gaps are only output at sites where another genome has a SNP. If you want to modify the code and submit a pull request with the functionality you need, I would be happy to include it. Andrew
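
For readers following the thread, the behaviour described above can be illustrated with a small Python sketch. This is not the snp-sites source (which is written in C); it only mimics the rule as stated here: a column is kept when the non-gap bases differ, and any gap in a kept column is carried through. The function names, the `count_gaps` switch, and the set of characters treated as gaps are all assumptions made for illustration; `count_gaps` is not an existing snp-sites option.

```python
# Illustrative sketch only; not the snp-sites implementation.
# Assumption: '-', 'N' and '?' are treated as gap/unknown characters.
GAP_CHARS = {"-", "N", "?"}


def variable_columns(seqs, count_gaps=False):
    """Return the indices of alignment columns considered variable.

    With count_gaps=False, a column is variable only if it contains at
    least two distinct non-gap bases. With count_gaps=True (a hypothetical
    mode, as requested in this issue), a gap counts as a distinct state.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    cols = []
    for i in range(length):
        states = {s[i].upper() for s in seqs}
        if not count_gaps:
            states -= GAP_CHARS
        if len(states) > 1:
            cols.append(i)
    return cols


def extract(seqs, cols):
    """Build the reduced alignment from the selected columns."""
    return ["".join(s[i] for i in cols) for s in seqs]


# Toy alignment: the third sequence has a gapped region (columns 1 to 3).
alignment = [
    "ACGTAC",
    "ACGTAC",
    "A---AT",
    "ATGTAC",
]

snp_only = variable_columns(alignment)
with_gaps = variable_columns(alignment, count_gaps=True)

print(snp_only, extract(alignment, snp_only))
print(with_gaps, extract(alignment, with_gaps))
```

In the toy alignment, column 1 is kept in both modes because the other genomes differ there (C versus T), so the gap in the third sequence appears in the output. Columns 2 and 3 vary only by a gap, so they are dropped unless the hypothetical `count_gaps` mode is enabled.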

mgalardini commented 7 years ago

Hi Andrew, my C is not nearly as good as required, but I'll try my best. Thanks for your prompt reply. Marco