jameshadfield / phandango

an interactive viewer for populations of bacterial genomes linked by a phylogeny
http://phandango.net
MIT License

Clarification of genome size presented #143

Open lauramataseje opened 2 years ago

lauramataseje commented 2 years ago

Hello, I have analyzed a collection of small pieces of DNA (~6100 bp) that differ by only a few base pairs. Using Roary and Prokka, I identified 7 reading frames ranging from ~100 to 1400 bp. I am confused about what Phandango is presenting for the presence/absence file (CSV) from Roary in this image, as I see only 180 bp and all the reading frames are the same size. I am sure I am reading something wrong, but I cannot find any reference to this online. Any help would be greatly appreciated.

[Image: Phandango screenshot]

jameshadfield commented 2 years ago

Hey @lauramataseje -- unfortunately the number of bases shown on the presence/absence view is meaningless [1], and I think we just assigned a fixed number of bases to each gene. Looking at your screenshot, perhaps we're also adding a buffer of one reading frame on either side of the 7 you have in the CSV (maybe?).

[1] I'm sure this is documented somewhere, but it's not at all obvious just by looking at it! The reason is that we built the viewer for recombination output (which uses bases) and then added Roary CSV support at a late stage. It would be better if the base-pair markers were not rendered.
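
To make the arithmetic concrete, here is a minimal Python sketch of a fixed-width layout. It is not Phandango's actual code: the function name `fixed_width_layout`, the gene labels, and the value of 20 bases per gene are all hypothetical. The width of 20 was chosen only because 7 genes plus one buffer slot on each side (the "maybe?" above) would then give 9 slots x 20 = 180, the axis length mentioned in the question.

```python
def fixed_width_layout(gene_names, bases_per_gene=20, buffer_slots=1):
    """Give every gene an equal-width slot on a pseudo base-pair axis.

    bases_per_gene and buffer_slots are illustrative assumptions,
    not values taken from the Phandango source.
    """
    layout = {}
    for index, name in enumerate(gene_names):
        # Real gene lengths are ignored; position depends only on column order.
        start = (buffer_slots + index) * bases_per_gene
        layout[name] = (start, start + bases_per_gene)
    # Total axis = gene slots plus one buffer region on each side.
    axis_length = (len(gene_names) + 2 * buffer_slots) * bases_per_gene
    return layout, axis_length


# Seven placeholder genes, standing in for the seven columns of the Roary CSV.
genes = [f"gene_{i}" for i in range(1, 8)]
layout, axis_length = fixed_width_layout(genes)
print(axis_length)  # 180
```

Under these assumptions every gene is drawn 20 units wide whether it is really ~100 bp or ~1400 bp long, which would explain why all reading frames look the same size and why the axis does not match the ~6100 bp sequence.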