sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

core_alignment.aln has lower and uppercase in sequences #611

Open samuelmontgomery opened 7 months ago

samuelmontgomery commented 7 months ago

Hi,

I ran the suggested command for a core alignment (roary -e --mafft -p 8 *.gff), and the output alignment file contains a mix of lowercase and uppercase sequences. From reading the MAFFT documentation, this seems to be related to how case is used there: lowercase for nucleotide sequences and uppercase for amino acid sequences. As a result, only ~20% of my sequence is recognised as nucleotides when building a tree, and the lowercase nucleotide sequences are not recognised.

Any idea how to fix that?
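
For reference, one possible workaround (not a confirmed Roary or MAFFT fix) would be to normalise the case of the alignment before passing it to the tree builder. Below is a minimal sketch, assuming the alignment is a FASTA file and that the downstream tool accepts uppercase nucleotides. The function names `uppercase_fasta` and `normalise_alignment` are made up for illustration, and the whole file is read into memory, which may not suit very large alignments.

```python
from pathlib import Path


def uppercase_fasta(text: str) -> str:
    """Upper-case sequence lines of FASTA text, leaving '>' header lines as-is."""
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            # Header line: keep sample names and descriptions unchanged.
            out.append(line)
        else:
            # Sequence line: gap characters such as '-' are unaffected by upper().
            out.append(line.upper())
    return "\n".join(out) + "\n"


def normalise_alignment(src: str, dst: str) -> None:
    """Hypothetical helper: read src, write a case-normalised copy to dst."""
    Path(dst).write_text(uppercase_fasta(Path(src).read_text()))
```

It could be called as `normalise_alignment("core_alignment.aln", "core_alignment.upper.aln")`, where both file names are placeholders for the actual input and output paths.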