xavierdidelot / ClonalFrameML

ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes
GNU General Public License v3.0

What do you mean I don't understand the output file #150

Closed: 17863952296 closed this issue 10 months ago

17863952296 commented 10 months ago

I have been looking at the output files for 4 days: ML_sequence.fasta, position_cross_reference.txt, em.txt, and importation_status.txt. Even though I have read the user guide, I still don't understand what the values in these files represent. Could you please explain them in detail? Also, I see in the literature that people use this software to work out how many recombination events come from within lineages, how many come from between lineages, and how many come from outside the genome. How is this calculated? Some papers even identify the donor and the recipient of recombination events that come from between lineages. The picture below is from the literature: [image: mbio 02781-21-sf002 (1)]

xavierdidelot commented 10 months ago

You are correct, ClonalFrameML does not say anything about where the recombination events come from. If you want to do such an analysis, you will need to extract each imported segment and assess its origin using a postprocessing step which is not part of ClonalFrameML. This should be explained in the papers you mentioned.
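
As a rough illustration of the "extract each imported segment" part of such a postprocessing step, here is a minimal Python sketch. It assumes that importation_status.txt is tab-delimited with a header line `Node`, `Beg`, `End` (one row per inferred event, giving the branch and the first and last affected positions) and that both coordinates are inclusive; please check this against your own file. The function names `read_imports` and `summarize` are hypothetical and not part of ClonalFrameML, and the sample rows are made up. The sketch only lists and counts segments per branch; it does not assess where a segment came from.

```python
import csv
from collections import defaultdict


def read_imports(lines):
    """Group imported segments by branch.

    `lines` is any iterable of text lines in the assumed layout:
    a header "Node<TAB>Beg<TAB>End" followed by one row per event.
    """
    segments = defaultdict(list)
    for row in csv.DictReader(lines, delimiter="\t"):
        segments[row["Node"]].append((int(row["Beg"]), int(row["End"])))
    return dict(segments)


def summarize(segments):
    """Return {node: (number of events, total imported length)}.

    Lengths assume that Beg and End are both inclusive.
    """
    return {
        node: (len(segs), sum(end - beg + 1 for beg, end in segs))
        for node, segs in segments.items()
    }


# Made-up rows standing in for a real importation_status.txt.
sample = [
    "Node\tBeg\tEnd",
    "NODE_5\t100\t250",
    "NODE_5\t900\t1000",
    "StrainA\t40\t60",
]

for node, (n_events, total_len) in summarize(read_imports(sample)).items():
    print(node, n_events, total_len)
```

To run it on real output, pass an open file handle instead of `sample`, for example `read_imports(open("importation_status.txt"))`. The sequence of each segment would then have to be cut out of the alignment and compared against candidate donors with a separate tool, as described in the papers mentioned above.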