sate-dev / sate-core


SATe sometimes outputting wrong labels on tree with checkpointing feature #10

Closed smirarab closed 12 years ago

smirarab commented 12 years ago

If checkpointing is enabled (the "checkpoint" branch), SATe sometimes outputs a tree with the wrong labels (i.e., the labels used internally in SATe instead of the original labels). The labels are fine in the output alignment file.

This is a bug in the checkpointing code. My guess is that if the tree accepted at the end comes from the checkpoint file, for some reason it is not translated back to the original labels.

jeetsukumaran commented 12 years ago

Hi Siavash,

Does the checkpointed tree have the original labels or the internal labels? If the latter, I suspect that when the tree is read in from the checkpoint file, it takes the internal labels as the original labels. The solution would be to ensure that the original label names are restored to the tree before it is written out when checkpointing is in use.
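The relabeling step suggested here can be sketched as follows. This is a minimal illustration, not SATe's actual code: the function name `restore_labels`, the `S1`-style internal labels, and the mapping dictionary are all hypothetical, and the regex assumes unquoted leaf labels in a plain Newick string.

```python
import re

def restore_labels(newick, internal_to_original):
    """Swap internal leaf labels in a Newick string for the original names.

    Hypothetical helper for illustration only; assumes unquoted labels.
    """
    def swap(match):
        label = match.group(0)
        # Labels without a mapping entry are left unchanged.
        return internal_to_original.get(label, label)

    # A leaf label starts right after "(" or "," and runs until the next
    # Newick punctuation character, so branch lengths (after ":") are skipped.
    return re.sub(r"(?<=[(,])[^\s(),:;]+", swap, newick)


# Assumed example mapping from internal labels back to original taxon names.
name_map = {"S1": "Homo_sapiens", "S2": "Pan_troglodytes", "S3": "Gorilla_gorilla"}
checkpointed_tree = "(S1:0.1,(S2:0.2,S3:0.3):0.4);"
print(restore_labels(checkpointed_tree, name_map))
```

The point of the sketch is only that the translation needs the name mapping to be available at write time; if the mapping is empty or missing, the internal labels pass through untouched.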

smirarab commented 12 years ago

The checkpointed tree uses internal labels. What you described was also my first guess. I will take a look at this problem at the first opportunity. Thanks!

smirarab commented 12 years ago

This issue is fixed in the latest code on the "checkpoint" branch.

The problem was that the SateJob object inherits from the TreeHolder object. TreeHolder has a "dataset" attribute, which is used at the end to output the final tree. I had missed the inheritance relation, and therefore had failed to restore "dataset" when a checkpoint was recovered. As a result, a new "dataset" object was being used, which apparently caused the name mappings to be lost.

I just had to add a line to make sure the "dataset" attribute of SateJob is also restored when a job is recovered (fortunately, it was already being backed up).
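The failure mode described above can be reproduced in a few lines. This is a simplified sketch under stated assumptions, not the SATe source: `Dataset`, `Job`, `checkpoint_state`, `restore`, and the `restore_dataset` flag are hypothetical stand-ins, and only the `TreeHolder` name and its `dataset` attribute come from the thread.

```python
import copy

class Dataset:
    """Hypothetical stand-in: holds the internal -> original name mapping."""
    def __init__(self, name_map=None):
        self.name_map = dict(name_map or {})

class TreeHolder:
    """Base class that owns the dataset used when writing the final tree."""
    def __init__(self, dataset):
        self.dataset = dataset
        self.tree = None

class Job(TreeHolder):
    """Hypothetical job class that inherits 'dataset' from TreeHolder."""
    def __init__(self, dataset):
        super().__init__(dataset)
        self.best_score = None

    def checkpoint_state(self):
        # The dataset is already part of the backup in this sketch.
        return copy.deepcopy(
            {"tree": self.tree, "best_score": self.best_score, "dataset": self.dataset}
        )

    def restore(self, state, restore_dataset=True):
        self.tree = state["tree"]
        self.best_score = state["best_score"]
        if restore_dataset:
            # The one-line fix: also recover the inherited attribute.
            self.dataset = state["dataset"]


original = Job(Dataset({"S1": "Homo_sapiens"}))
original.tree = "(S1,S2);"
state = original.checkpoint_state()

# Recovery without the fix: the fresh Dataset has no name mappings.
broken = Job(Dataset())
broken.restore(state, restore_dataset=False)

# Recovery with the fix: the mappings come back with the dataset.
fixed = Job(Dataset())
fixed.restore(state)
```

In the sketch, `broken.dataset.name_map` stays empty, so any later translation step has nothing to translate with, while `fixed.dataset.name_map` carries the original mapping. Attributes inherited from a base class are easy to overlook when listing what a restore routine must recover.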