marbl / parsnp

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, core genome phylogeny and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

Allowed symbols in fasta files #101

Closed: valery-shap closed this issue 2 years ago

valery-shap commented 2 years ago

Hello, I have the same issue as in #100, but changing the reference to a single contig containing only the chromosome doesn't help. Could you please tell me which symbols are allowed in parsnp? Only ACTG, or N and U too? Does case matter? Valery

bkille commented 2 years ago

I believe the way MUSCLE was integrated, it assumes the DNA alphabet of AGCTNagctn. That being said, you might still be able to get by with other characters, as not all parts of the DNA are passed through MUSCLE.
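To see whether a given input contains anything outside that alphabet, a quick scan of the FASTA file can help. The following is a minimal sketch, not part of Parsnp: the function name `unexpected_symbols` and the constant `ASSUMED_ALPHABET` are hypothetical, and the alphabet is simply the `AGCTNagctn` set mentioned above, which may not be enforced in every code path.

```python
from collections import Counter
from typing import Iterable

# Assumption: the alphabet named in the comment above. Whether Parsnp
# rejects other symbols everywhere is not confirmed in this thread.
ASSUMED_ALPHABET = frozenset("AGCTNagctn")


def unexpected_symbols(lines: Iterable[str],
                       alphabet: frozenset = ASSUMED_ALPHABET) -> Counter:
    """Count sequence characters that fall outside `alphabet`.

    `lines` is any iterable of FASTA lines, e.g. an open file handle.
    Header lines (starting with '>') and blank lines are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        counts.update(ch for ch in line if ch not in alphabet)
    return counts


# Hypothetical usage (the file name is a placeholder):
#     with open("genome.fna") as handle:
#         print(unexpected_symbols(handle).most_common())
```

An empty result means every sequence character is in the assumed alphabet. Symbols such as `U`, IUPAC ambiguity codes, or gap characters would show up with their counts, which gives a starting point for deciding whether to convert or mask them before running parsnp.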