tbigot / tpms

Tree Pattern Matching Suite

missed root in tpms_computations RU #8

Closed flassalle closed 12 years ago

flassalle commented 12 years ago

Here is a case where the root output by RU is not (one of) the roots giving the minimal sum of unicity scores.

Here is the original tree (as stored in the tree database formatted by tpms_mkdb):

HBG001548.phb [ ATU7B_1_SLYA"ATU7B" ATU7C_1_SLYA"ATU7C" ATU7A_1_SLYA"ATU7A" ATU13_1_SLYA"ATU13" ATU5A_1_SLYA"ATU5A" ATU2A_1_SLYA"ATU2A" ATU4C_1_SLYA"ATU4C" ATU4B_1_SLYA"ATU4B" ATU4A_1_SLYA"ATU4A" ATU9A_1_SLYA"ATU9A" ATU1C_1_SLYA"ATU1C" ATU1B_1_SLYA"ATU1B" ATU1A_1_SLYA"ATU1A" ATU8A_1_SLYA"ATU8A" AGRT5_1_PE1067"AGRT5" ATU3A_1_SLYA"ATU3A" ATU6A_1_SLYA"ATU6A" AGRRK_1_PE1234"AGRRK" SINMW_1_PE731"SINMW" SIMEL1_2_PE1136"SIMEL1" ATU1C_3_PE463"ATU1C" AGRVS_2_PE568"AGRVS" AGRRK_1_PE1507"AGRRK" ATU4A_1_PE3006"ATU4A" ] (((((((((((ATU7B_1_SLYA:1e-10,ATU7C_1_SLYA:0.0153245733)0.0:4.515e-07,ATU7A_1_SLYA:1e-10)0.0:9.03e-07,((ATU13_1_SLYA:1e-10,ATU5A_1_SLYA:1e-10)0.755:0.0077356816,ATU2A_1_SLYA:0.0154178072)0.759:0.0075908841)0.0:5.02e-07,((ATU4C_1_SLYA:1e-10,ATU4B_1_SLYA:1e-10)0.856:0.0076550764,ATU4A_1_SLYA:5.273e-07)0.836:0.0155106944)0.778:0.0102347771,ATU9A_1_SLYA:0.0053604199)0.855:0.0154048532,(ATU1C_1_SLYA:1e-10,(ATU1B_1_SLYA:1e-10,ATU1A_1_SLYA:1e-10)0.0:1e-10)0.886:0.0155730925)0.433:0.0095589268,((ATU8A_1_SLYA:0.0077396087,(AGRT5_1_PE1067:7.681e-07,ATU3A_1_SLYA:0.0077792728)0.0:5.88e-07)0.0:5.325e-07,ATU6A_1_SLYA:0.0156634547)0.745:0.0142423688)0.999:0.4409643582,AGRRK_1_PE1234:0.3312947792):0.1093171772,(SINMW_1_PE731:0.0740601495,SIMEL1_2_PE1136:0.0387884966):0.2577709492)0.685:0.1878338621,ATU1C_3_PE463:1.1638138625):0.0478866248,((AGRVS_2_PE568:0.2993867508,AGRRK_1_PE1507:0.2774220742):0.4824473097,ATU4A_1_PE3006:0.9582788272):0.9996290192)0.931:0;

And here is the output after running the RU program followed by the AU program:

(([0]AGRRK_1_PE1234:0.331295,(([0]SINMW_1_PE731:0.0740601,[0]SIMEL1_2_PE1136:0.0387885)[0]:0.257771,([0]ATU1C_3_PE463:1.16381,(([0]AGRVS_2_PE568:0.299387,[0]AGRRK_1_PE1507:0.277422)[0]:0.482447,[0]ATU4A_1_PE3006:0.958279)[0]:1.04752)[0]:0.187834)[0]0.685:0.109317)[0.693147]:0.220482,((((((([0]ATU7B_1_SLYA:1e-10,[0]ATU7C_1_SLYA:0.0153246)[0]0.0:4.515e-07,[0]ATU7A_1_SLYA:1e-10)[0]0.0:9.03e-07,(([0]ATU13_1_SLYA:1e-10,[0]ATU5A_1_SLYA:1e-10)[0]0.755:0.00773568,[0]ATU2A_1_SLYA:0.0154178)[0]0.759:0.00759088)[0]0.0:5.02e-07,(([0]ATU4C_1_SLYA:1e-10,[0]ATU4B_1_SLYA:1e-10)[0]0.856:0.00765508,[0]ATU4A_1_SLYA:5.273e-07)[0]0.836:0.0155107)[0]0.778:0.0102348,[0]ATU9A_1_SLYA:0.00536042)[0]0.855:0.0154049,([0]ATU1C_1_SLYA:1e-10,([0]ATU1B_1_SLYA:1e-10,[0]ATU1A_1_SLYA:1e-10)[0]0.0:1e-10)[0]0.886:0.0155731)[0]0.433:0.00955893,(([0]ATU8A_1_SLYA:0.00773961,([0]AGRT5_1_PE1067:7.681e-07,[0]ATU3A_1_SLYA:0.00777927)[0]0.0:5.88e-07)[0]0.0:5.325e-07,[0]ATU6A_1_SLYA:0.0156635)[0]0.745:0.0142424)[0]0.999:0.220482)[2.07944];

whereas the tree with the minimum sum of unicity scores is the following:

((((((((((ATU7B_1_SLYA:1.0E-10,ATU7C_1_SLYA:0.0153245733)0.0:4.515E-7,ATU7A_1_SLYA:1.0E-10)0.0:9.03E-7,((ATU13_1_SLYA:1.0E-10,ATU5A_1_SLYA:1.0E-10)0.755:0.0077356816,ATU2A_1_SLYA:0.0154178072)0.759:0.0075908841)0.0:5.02E-7,((ATU4C_1_SLYA:1.0E-10,ATU4B_1_SLYA:1.0E-10)0.856:0.0076550764,ATU4A_1_SLYA:5.273E-7)0.836:0.0155106944)0.778:0.0102347771,ATU9A_1_SLYA:0.0053604199)0.855:0.0154048532,(ATU1C_1_SLYA:1.0E-10,(ATU1B_1_SLYA:1.0E-10,ATU1A_1_SLYA:1.0E-10)0.0:1.0E-10)0.886:0.0155730925)0.433:0.0095589268,((ATU8A_1_SLYA:0.0077396087,(AGRT5_1_PE1067:7.681E-7,ATU3A_1_SLYA:0.0077792728)0.0:5.88E-7)0.0:5.325E-7,ATU6A_1_SLYA:0.0156634547)0.745:0.0142423688)0.999:0.4409643582,AGRRK_1_PE1234:0.3312947792):0.1093171772,(SINMW_1_PE731:0.0740601495,SIMEL1_2_PE1136:0.0387884966):0.2577709492)0.685:0.09391693105,(((AGRVS_2_PE568:0.2993867508,AGRRK_1_PE1507:0.2774220742):0.4824473097,ATU4A_1_PE3006:0.9582788272):1.047515644,ATU1C_3_PE463:1.1638138625):0.09391693105);

I cannot see what in the input tree could lead tpms to overlook this solution.

tbigot commented 12 years ago

Maybe I have a lead: even though, in the end, I root on branches, I only test nodes as candidate roots, ignoring the branches themselves.

I have to change this behaviour, because we can miss some optimal results.

Thanks again, Mr Lassalle.

flassalle commented 12 years ago

I'll act as secretary for our discussion (it helps me remember what the algorithm does): for each node, you consider its parental branch as a candidate root to test (the current root is ignored, as it is not a "real" node and has no parental branch, and it will be tested anyway when looking at its direct child nodes). For each branch considered, you compute the taxonomy/unicity score of the neighboring nodes (the focal node and its father) and propagate the computation in both directions.
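To make the description above concrete, here is a minimal Python sketch of "try every branch as a root and keep the one with the smallest sum of unicity scores". It is not the tpms code. It assumes that a node's unicity score is the sum of the logs of the per-species leaf counts below it; that assumption is not stated anywhere in this thread, it is only consistent with the bracketed values 0.693147 (ln 2) and 2.07944 (ln 8) in the output above. The function names and the toy tree are hypothetical, and the scores are recomputed from scratch for each branch rather than propagated in both directions.

```python
import math
from collections import Counter

def leaf_species(adj, species, node, parent):
    """Count species among the leaves reached from `node` when walking away from `parent`."""
    if node in species:  # leaves are the nodes that carry a species label
        return Counter([species[node]])
    total = Counter()
    for nb in adj[node]:
        if nb != parent:
            total += leaf_species(adj, species, nb, node)
    return total

def unicity(counts):
    # ASSUMED definition: log of the product of per-species leaf counts
    # (0 when no species is repeated). Not taken from the tpms source.
    return sum(math.log(c) for c in counts.values())

def subtree_sum(adj, species, node, parent):
    """Sum of unicity scores over every node of the subtree hanging from `node`."""
    total = unicity(leaf_species(adj, species, node, parent))
    for nb in adj[node]:
        if nb != parent:
            total += subtree_sum(adj, species, nb, node)
    return total

def best_root_branch(adj, species):
    """Test every branch (u, v) as the root position; return (branch, score)."""
    root_score = unicity(Counter(species.values()))  # the new root sees all leaves
    best = None
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected branch once
                score = (root_score
                         + subtree_sum(adj, species, u, v)
                         + subtree_sum(adj, species, v, u))
                if best is None or score < best[1]:
                    best = ((u, v), score)
    return best

# Toy unrooted tree: ((a1,b1),(a2,b2)) with internal nodes x and y.
adj = {
    "a1": ["x"], "b1": ["x"], "a2": ["y"], "b2": ["y"],
    "x": ["a1", "b1", "y"], "y": ["a2", "b2", "x"],
}
species = {"a1": "A", "b1": "B", "a2": "A", "b2": "B"}

branch, score = best_root_branch(adj, species)
print(branch, score)  # the internal branch ('x', 'y') gives the smallest sum
```

In this toy case, rooting on the internal branch keeps the two copies of each species in separate subtrees, so only the root itself carries a non-zero score. Rooting on any leaf branch leaves a repeated species under one internal node and gives a larger sum.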

I'm not certain I am being clear, or even right, but Mr Bigot will certainly correct me if necessary.

Thank you again Mr Bigot for these marvelous algorithms.

tbigot commented 12 years ago

Your reminder is right. I'll commit the new version very soon.

tbigot commented 12 years ago

Please try it, and re-open the bug if it does not work! Thanks for your report.