iqtree / iqtree2

NEW location of IQ-TREE software for efficient phylogenomic software by maximum likelihood http://www.iqtree.org
GNU General Public License v2.0

BUG: Identical sequences not added to tree at end when restarting analysis from checkpoint #143

Open EricSalomaki opened 1 year ago

EricSalomaki commented 1 year ago

Hello,

I was recently running analyses on a dataset containing ~27,000 sequences (~30,000 sites). In the analysis, 306 sequences were reported as identical to another sequence in the dataset and would be "ignored but added in the end". My initial analysis timed out, so I restarted it from a checkpoint. However, these sequences were not added back to the resulting tree once the run completed, even though the rerun log still reported them as identical sequences to be added at the end. I did not have this issue when I ran the same analysis without needing to resume from a checkpoint. Hopefully this can be fixed in a future release.

bqminh commented 4 months ago

My test run with a small alignment shows that the .treefile does contain all original sequences, including the identical ones that IQ-TREE reported, even after recovering from a checkpoint. I observed this with both v1.6.12 and v2.3.4, so I can't reproduce what you describe. However, I recognise that the tree in the .iqtree file does not contain these identical sequences, regardless of whether IQ-TREE was interrupted. That is not a major issue, assuming that most users will only use the .treefile.
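For anyone who wants to check their own output, here is a minimal sketch that lists alignment sequences absent from a tree. It is not part of IQ-TREE. The function names are hypothetical, and it assumes a FASTA alignment, an unquoted Newick tree, and sequence names that appear verbatim in the tree (any renaming between input and output would show up as false positives).

```python
import re


def fasta_names(alignment_text):
    """Return the set of sequence names (first token after '>') in FASTA text."""
    return {
        line[1:].split()[0]
        for line in alignment_text.splitlines()
        if line.startswith(">") and line[1:].strip()
    }


def newick_leaf_names(newick):
    """Return the set of leaf labels in an unquoted Newick string.

    A leaf label directly follows '(' or ','; internal node labels and
    support values follow ')' and are therefore not matched.
    """
    return set(re.findall(r"[(,]\s*([^():,;\s]+)", newick))


def missing_from_tree(alignment_text, newick):
    """Return the sorted list of alignment names that are not leaves of the tree."""
    return sorted(fasta_names(alignment_text) - newick_leaf_names(newick))


# Example with hypothetical file names (adjust to your own run):
# with open("example.fasta") as aln, open("example.fasta.treefile") as tree:
#     print(missing_from_tree(aln.read(), tree.read()))
```

An empty list means every alignment sequence is present in the tree. Running the same check on a run that was resumed from a checkpoint and on one that was not should show whether the two differ.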