xavierdidelot / ClonalFrameML

ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes
GNU General Public License v3.0
About branches with 0 length #55

Closed krishnansandeep1980 closed 7 years ago

krishnansandeep1980 commented 7 years ago

Hello, I have a question about ClonalFrameML. When I used it to reconstruct a tree of E. coli strains, some of the branches shrank to almost zero length, and they are not shown in the em.txt file. I tried the option '-mcmc_infer_branch_lengths' mentioned in the supplementary text of the PLOS paper, but the program says there is no such option. Is it possible to force the program to estimate branch lengths for all the branches? Thank you very much.

xavierdidelot commented 7 years ago

This sounds a bit strange to me. All branches of the input tree should always be shown in the output tree and in the em.txt file, but they might be given different names. Could you please check, and if you're still having a problem attach your input tree and output tree?

krishnansandeep1980 commented 7 years ago

Hello,

The example "tmp10strains" contains sequences of 10 E. coli strains, taken from the core genes. I reconstructed their tree using BEAST, and then summarized the BEAST posterior trees as a maximum clade credibility tree (MCCT) for ClonalFrameML. There are 8 non-terminal branches in the input tree and also in the ClonalFrameML output (labelled_tree.newick), but the em.txt file has information on only 7 non-terminal branches.

The other example ("example100genes") also contains 10 E. coli strains, with sequences from 100 core genes. I made an MCCT from the BEAST reconstruction and used it as the input to ClonalFrameML. In the em.txt file, there are no branch length estimates for the non-terminal branches.

Data: https://mega.nz/#F!JrBDDCDA!lPgAuuY9WJCWrjsfsarTmQ

I have a further question. Is there a way to change the termination condition of the ClonalFrameML run, e.g., the number of Monte Carlo steps or the burn-in?

Thank you very much for your help.


xavierdidelot commented 7 years ago

For the "tmp10strains" example, there are indeed 8 non-terminal branches in the input tree, but ClonalFrameML treats the tree as unrooted, so the two branches descending from the root are in fact a single branch. That leaves 7 non-terminal branches in both the output newick file and the em.txt file.
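The 8-versus-7 count can be checked with a little arithmetic. This is only a sketch, assuming a fully bifurcating tree; the function name `branch_counts` is made up for illustration and is not part of ClonalFrameML.

```python
def branch_counts(n_tips: int, rooted: bool) -> tuple[int, int]:
    """Return (total branches, non-terminal branches) for a bifurcating tree.

    Assumption: every internal node has exactly two children (rooted view),
    so a rooted tree has 2n - 2 branches and an unrooted one has 2n - 3,
    because the two branches below the root merge into one.
    """
    total = 2 * n_tips - 2 if rooted else 2 * n_tips - 3
    non_terminal = total - n_tips  # one terminal branch per tip
    return total, non_terminal


rooted_total, rooted_internal = branch_counts(10, rooted=True)
unrooted_total, unrooted_internal = branch_counts(10, rooted=False)
print(rooted_internal, unrooted_internal)
```

With 10 strains this gives 8 non-terminal branches for the rooted input tree and 7 once the root is ignored, matching the counts discussed here.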

For the "example100genes" example, the internal branches disappear because ClonalFrameML cannot interpret the input tree, whose branch length values are very large. If you divide all branch lengths of the input tree by 1e6, then it works.
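One possible way to do that rescaling is a small script like the one below. It is a rough sketch, not something shipped with ClonalFrameML: it assumes a plain Newick string with unquoted labels and no BEAST-style `[&...]` annotations, and the function name `rescale_newick` is hypothetical.

```python
import re

# Matches ":<number>", where the number is decimal or scientific notation.
# Assumption: colons appear only in front of branch lengths (no quoted labels).
_BRANCH_LENGTH = re.compile(r":\s*([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)")


def rescale_newick(newick: str, divisor: float) -> str:
    """Divide every branch length in a Newick string by `divisor`."""

    def _scale(match: re.Match) -> str:
        return ":" + repr(float(match.group(1)) / divisor)

    return _BRANCH_LENGTH.sub(_scale, newick)


# Toy tree with very large branch lengths (made-up values, not the real data).
example = "((A:1000000,B:2000000):500000,C:3000000);"
print(rescale_newick(example, 1e6))
```

Only the numbers after a colon are touched, so tip names and the tree topology stay the same. For a real file, read the tree text, pass it through the function, and write the result to a new file rather than overwriting the original.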

In answer to your other question, ClonalFrameML is not an MCMC method, so you can't change the number of steps.

By the way, it's not such a good idea to use BEAST to create the CFML input tree; I would recommend a maximum-likelihood (ML) method such as RAxML or PhyML.