veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Difficulty altering tree in HBL #662

Closed halabikeren closed 6 years ago

halabikeren commented 6 years ago

Hey dear team,

I am trying to alter a tree in HBL: specifically, I want to add an internal node with a single child and assign a length to the branch whose upper end is the new internal node.

Here is an example of how I'm trying to do so:

/* attempt to edit Tree instances */

/* toy example for usage: with branch lengths */
fprintf(stdout, "\n**** test with original branch lengths in tree ****\n");
Tree T1 = "((S1:0.1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2);"; // maybe needs to be a topology instance - try Tree first
fprintf (stdout, "tree before edit: ", Format (T1,1,1), "\n");
T1 + {"PARENT" : "internal1" , "WHERE" : "S1", "LENGTH" : 0.42}; // branch lengths are not conserved
fprintf (stdout, "tree after edit: ", Format (T1,11,1), "\n");

/* toy example for usage: without branch lengths */
fprintf(stdout, "\n**** test with no original branch lengths in tree ****\n");
Tree T2 = "((S1,S2)Node1,S3,S4);"; // maybe needs to be a topology instance - try Tree first
fprintf (stdout, "tree before edit: ", Format (T2,1,1), "\n");
T2 + {"PARENT" : "internal1" , "WHERE" : "S1", "LENGTH" : 0.42}; // branch lengths are not conserved
fprintf (stdout, "tree after edit: ", Format (T2,1,1), "\n");

This yields:

**** test with original branch lengths in tree ****
tree before edit: ((S1:0.1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2)
tree after edit: (((S1:0.1)internal1:-1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2)

**** test with no original branch lengths in tree ****
tree before edit: ((S1:-1,S2:-1)Node1:-1,S3:-1,S4:-1)
tree after edit: (((S1:-1)internal1:-1,S2:-1)Node1:-1,S3:-1,S4:-1)

This is what I'm expecting to receive:

**** test with original branch lengths in tree ****
tree before edit: ((S1:0.1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2)
tree after edit: (((S1:0.42)internal1:-1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2)

**** test with no original branch lengths in tree ****
tree before edit: ((S1:-1,S2:-1)Node1:-1,S3:-1,S4:-1)
tree after edit: (((S1:0.42)internal1:-1,S2:-1)Node1:-1,S3:-1,S4:-1)

Am I using this syntax incorrectly?

Additionally, when I add a node whose name carries a label, the node name is reformatted. For example, the node name I0{Test} becomes I0_Test_ when added to the tree. Is there any way to avoid this, or alternatively to add the labels to the nodes after altering the tree?

Many thanks! Keren

spond commented 6 years ago

Dear @halabikeren,

The most general form of Tree + object will splice the target branch like so:


Splice the target branch with existing length L

Before

       |-----L-----| [B] ...
 ...---|
       |--------- ...

T + {"WHERE": "B", "NAME" : "new_child", "PARENT_NAME" : "splice_parent", "LENGTH" : x, "PARENT_LENGTH" : y}

After:

       |-y-|-L-----| [B] ...
       |   |
       |   |--x---- [new_child]
 ...---|
       |--------- ...

Omitting NAME or PARENT_NAME (but not both!) will split the existing branch into two components (with a single node in between).

None of these operations will ever change the length of the existing branch (L).

Best, Sergei
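
The splice rule described above can be modeled outside HyPhy to check what output to expect. The following is a minimal Python sketch, not HyPhy's implementation: `Node`, `find_parent`, `splice`, and `to_newick` are hypothetical names introduced for illustration. It covers only the two cases shown in this thread (a new parent alone, or a new parent plus a new sibling) and prints unset lengths as -1 to mirror the output above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    length: Optional[float] = None  # branch above this node; None = unset
    children: List["Node"] = field(default_factory=list)


def find_parent(root: Node, target: str) -> Optional[Node]:
    """Return the parent of the node named `target`, or None if not found."""
    for child in root.children:
        if child.name == target:
            return root
        found = find_parent(child, target)
        if found is not None:
            return found
    return None


def splice(root, where, parent_name, name=None, length=None, parent_length=None):
    """Insert a new parent above `where`; optionally add a new sibling `name`.

    Assumption (from the thread): `length` applies only to the new sibling,
    `parent_length` to the new parent, and the branch above `where` is kept.
    """
    parent = find_parent(root, where)
    if parent is None:
        raise KeyError(f"no non-root node named {where!r}")
    idx = next(i for i, c in enumerate(parent.children) if c.name == where)
    target = parent.children[idx]  # its own length L is never modified
    new_parent = Node(parent_name, parent_length, [target])
    if name is not None:
        new_parent.children.append(Node(name, length))
    parent.children[idx] = new_parent


def to_newick(node: Node, is_root: bool = True) -> str:
    """Serialize with node names and lengths; unset lengths print as -1."""
    text = node.name
    if node.children:
        inner = ",".join(to_newick(c, False) for c in node.children)
        text = "(" + inner + ")" + text
    if not is_root:
        text += ":" + ("-1" if node.length is None else f"{node.length:g}")
    return text


tree = Node("", children=[
    Node("Node1", 0.1, [Node("S1", 0.1), Node("S2", 0.1)]),
    Node("S3", 0.1),
    Node("S4", 0.2),
])
splice(tree, "S1", "internal1", length=0.42)  # LENGTH has no sibling to apply to
print(to_newick(tree))
# prints (((S1:0.1)internal1:-1,S2:0.1)Node1:0.1,S3:0.1,S4:0.2)
```

Under this model, the 0.42 in the original question has nowhere to go: it belongs to a new sibling that was never requested, so S1 keeps 0.1 and the new parent stays unset unless a parent length is supplied.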

halabikeren commented 6 years ago

Thank you for the clarification, Sergei!

Additionally, when I try to add a node with a name that includes a label, the node name is reformatted.

This example:

fprintf(stdout, "**** add a new parent of B which will be a child of A ****\n"); 
Tree T1 = "(((B:1,C:1)A:1),D:1);";
fprintf(stdout, "tree before edit: ", Format(T1,1,1), "\n");
T1 + {"WHERE": "B", "PARENT" : "parent{FG}", "LENGTH" : 0.42, "PARENT_LENGTH" : 0.042};
fprintf(stdout, "tree after edit: ", Format(T1,1,1), "\n\n");

Yields:

**** add a new parent of B which will be a child of A ****
tree before edit: ((B:1,C:1)A:1,D:1)
tree after edit: (((B:1)parent_FG_:0.042,C:1)A:1,D:1)

Is there any way to avoid this re-formatting, or alternatively, add the labels to the nodes after altering the tree?

Thanks! Keren

spond commented 6 years ago

Dear @halabikeren,

HyPhy will not parse meta information when you call Tree + dict. In order to assign a model to a specific tree branch, please use something like the following code.


Model BSREL = (...);

...

SetParameter (T1.parent, MODEL, BSREL);

Also note that you can directly manipulate branch -> class (e.g. FG/BG) assignment as it is simply a dictionary.

Best, Sergei
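
Because labels are not parsed during a tree edit, one option is to separate the label from the node name beforehand and keep the branch-to-class assignment in a plain dictionary. The following is a minimal Python preprocessing sketch under that assumption. It is not part of HyPhy, and `split_label` and `collect_classes` are hypothetical helper names.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches names of the form name{label}, e.g. parent{FG}
LABELED = re.compile(r"^([^{}]+)\{([^{}]*)\}$")


def split_label(raw: str) -> Tuple[str, Optional[str]]:
    """Split 'parent{FG}' into ('parent', 'FG'); unlabeled names pass through."""
    match = LABELED.match(raw)
    if match is None:
        return raw, None
    return match.group(1), match.group(2)


def collect_classes(raw_names: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Return the clean node names and a branch -> class dictionary."""
    clean: List[str] = []
    classes: Dict[str, str] = {}
    for raw in raw_names:
        name, label = split_label(raw)
        clean.append(name)
        if label is not None:
            classes[name] = label
    return clean, classes


names, classes = collect_classes(["parent{FG}", "B", "I0{Test}"])
print(names)    # ['parent', 'B', 'I0']
print(classes)  # {'parent': 'FG', 'I0': 'Test'}
```

The clean names can then be used for the tree edit, while the dictionary carries the class assignment separately.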

halabikeren commented 6 years ago

Dear @spond,

Thank you for the great suggestion!

Since I need to convert the tree instance to a dictionary anyway, this could be a good solution for my case.

I found a way to do that using the function trees.ExtractTreeInfo():

LoadFunctionLibrary("libv3/tasks/trees.bf");
Tree T = (((A:1,B:1)C:2),D:3);
TStr = Format(T, 1, 1);
TDict = trees.ExtractTreeInfo(TStr);
fprintf(stdout, "TStr: ", TStr, "\n"); 
fprintf(stdout, "TDict: ", TDict, "\n");

This yields:

TStr: ((A:1,B:1)C:2,D:3)
TDict: {
 "string":"(A,B,D)",
 "string_with_lengths":"(A:1,B:1,D:3)",
 "branch length":{
   "A":1,
   "B":1,
   "D":3
  },
 "annotated_string":"(A,B,D);",
 "model_map":{
   "A":"",
   "B":"",
   "D":""
  },
 "partitioned":{
   "A":"leaf",
   "B":"leaf",
   "D":"leaf"
  },
 "model_list":{
  {""}
  }
}

Which is exactly what I need.

I have three questions regarding this procedure:

1) In the process of converting the string to a dictionary, the internal nodes appear to be lost. Is there any way to preserve them?
2) Presumably, I could now alter the model_map and model_list entries in the dictionary in order to add the labels. However, altering the "annotated_string" entry is more challenging. Is it OK to ignore this entry and rely solely on model_map for the labeling (even though the annotated string and the model map would then contradict each other)?
3) Since I'm already using the dictionary as the data structure representing the tree, can I edit branch lengths through its "branch length" entry, rather than through regex and substring replacement on the tree string?

Thank you! Keren