medvedevgroup / SibeliaZ

A fast whole-genome aligner based on de Bruijn graphs
http://medvedevgroup.com/

question about LCB determination #50

Open 0xaf1f opened 1 year ago

0xaf1f commented 1 year ago

I was interested in how the LCBs (locally collinear blocks) called by SibeliaZ change as more genomes are included. As a starting point, I ran it on two ~4.5 Mb genomes that differ only by a single ~10 kb inversion that I introduced. Since the sequences are otherwise identical, I expected about 3 LCBs (the region before the inversion, the inverted segment, and the region after it), but I got around 2100. I can't find anything in the algorithm description in your paper that explains this: bubbles should only appear in the vicinity of the inversion, and nowhere else. Is such a large number of LCBs expected for near-identical genomes, and if so, what causes it?
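
For reference, here is a minimal sketch of how a test pair like this could be built and why roughly 3 blocks would be the naive expectation. It is not the exact script used for the run above; the function names `invert_segment` and `expected_blocks` are hypothetical, and the sketch assumes the inversion is a reverse complement of a half-open interval `[start, end)` in a plain nucleotide string.

```python
# Hypothetical helpers, for illustration only (not part of SibeliaZ).

# Translation table for complementing nucleotides; other symbols (e.g. N) pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def invert_segment(seq: str, start: int, end: int) -> str:
    """Return seq with the half-open interval [start, end) reverse-complemented."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("interval out of range")
    inverted = seq[start:end][::-1].translate(_COMPLEMENT)
    return seq[:start] + inverted + seq[end:]


def expected_blocks(length: int, start: int, end: int):
    """Naive expectation: one collinear block on each side of the inversion,
    plus the inverted segment itself. Empty intervals are dropped."""
    candidates = [
        (0, start, "+"),      # identical prefix
        (start, end, "-"),    # inverted segment, matches on the opposite strand
        (end, length, "+"),   # identical suffix
    ]
    return [(s, e, strand) for s, e, strand in candidates if e > s]


if __name__ == "__main__":
    original = "AAAACCGGGTTT"
    mutated = invert_segment(original, 4, 9)
    print(mutated)
    print(expected_blocks(len(original), 4, 9))
```

Applying `invert_segment` twice with the same coordinates restores the original sequence, which is a quick sanity check that the two inputs really differ only by the inversion.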