Open josephwb opened 8 years ago
I guess I should wait for @jar398 to give me the go ahead on this, since people have expressed that they want this, but none of the myriad docs explicitly discusses it.
I think it's a good idea
-- apologies for brevity / using handheld gizmo --
On Feb 16, 2016, at 20:41, "Joseph W. Brown" notifications@github.com wrote:
I guess I should wait for @jar398 https://github.com/jar398 to give me the go ahead on this, since people have expressed that they want this, but none of the myriad docs explicitly discusses it.
— Reply to this email directly or view it on GitHub https://github.com/OpenTreeOfLife/treemachine/issues/205#issuecomment-184968488 .
What methods does this affect?
"subtree" and "induced_subtree" (only the latter is implemented at the moment; I can turn it off if you like).
Old:
curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/induced_subtree -H "content-type:application/json" -d '{"node_ids":["ott501678","ott666104","ott316878"], "ott_ids":[102710,536234,810751,81461,501678]}'
{
"node_ids_not_in_tree" : [ "ott501678" ],
"ott_ids_not_in_tree" : [ 501678 ],
"newick" : "(((Clangula_ott316878,Perdix_ott102710)Galloanserae_ott5839486,((Selasphorus_calliope_ott536234)Trochilidae_ott810751,Setophaga_ott666104))Neognathae_ott241846)Aves_ott81461;",
"synth_id" : "opentree4.1"
}
New:
curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/induced_subtree -H "content-type:application/json" -d '{"node_ids":["ott501678","ott666104","ott316878"], "ott_ids":[102710,536234,810751,81461,501678]}'
{
"node_ids_not_in_tree" : [ "ott501678" ],
"ott_ids_not_in_tree" : [ 501678 ],
"newick" : "(((Clangula_ott316878,Perdix_ott102710)Galloanserae_ott5839486,((Selasphorus_calliope_ott536234)Trochilidae_ott810751,Setophaga_ott666104)mrcaott246ott5481)Neognathae_ott241846)Aves_ott81461;",
"synth_id" : "opentree4.1"
}
Thanks for the "before" and "after" examples. I was unclear on the consequences of this change. Do I understand correctly that this isn't just a matter of labeling, but of including entirely new nodes (e.g. mrcaott246ott5481
above)?
we could make it an option if there are worries about clutter or space or compatibility. {"include_nontaxon_labels" : true , ...} not sure how to decide this.
On Wed, Feb 17, 2016 at 4:51 PM, Joseph W. Brown notifications@github.com wrote:
"subtree" and "induced_subtree" (only the latter is implemented at the moment; I can turn it off if you like).
— Reply to this email directly or view it on GitHub https://github.com/OpenTreeOfLife/treemachine/issues/205#issuecomment-185423241 .
@jimallman the trees are the exact same except for labelling.
+1 for including them Ambivalent about option to exclude. Hold off until it becomes a problem? Would be a compatible change.
"nontaxon" nodes could be part of the query; a reason for them to be returned.
the trees are the exact same except for labelling
thanks, my Newick-fu is lacking :bowtie:
Here is subtree. Before:
curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'
{
"newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474),(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)))Gavia_ott803675;",
"synth_id" : "opentree4.1"
}
After:
curl -X POST http://localhost:7474/db/data/ext/tree_of_life_v3/graphdb/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}'
{
"newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474)mrcaott651474ott1085739,(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)mrcaott90560ott1057518)mrcaott90560ott651474)Gavia_ott803675;",
"synth_id" : "opentree4.1"
}
Deployed on test for testing. Let me know if you want me to revert things.
It would be compatible in the sense of having a /v3 URL instead of a /v2 URL. But it's easy to imagine clients used to the v2 Newicks barfing on the v3 Newicks. That's OK with me, just cautioning on "would be a compatible change"
But I have no strong feelings.
On Wed, Feb 17, 2016 at 5:02 PM, Joseph W. Brown notifications@github.com wrote:
"nontaxon" nodes could be part of the query; a reason for them to be returned.
— Reply to this email directly or view it on GitHub https://github.com/OpenTreeOfLife/treemachine/issues/205#issuecomment-185427642 .
Please formulate some definite feelings, strong or no; otherwise how can this possibly move forward (or not)?
Just FYI, I've imported both the old and new Newick strings from this comment above, on devtree:
https://devtree.opentreeoflife.org/curator/study/edit/tt_35/?tab=trees
Look above for the strongest uncontradicted opinion. (It's "+1 for including them Ambivalent about option to exclude. Hold off [on making them optional] until it becomes a problem".)
A different, stronger, better supported opinion might come along later, e.g. from Mark or one of our users. So just write the code so it's easy to turn on or off. No big deal.
Getting this right immediately is not as important as getting things like 'tax_sources' right immediately because we don't have code inside the project that depends on it. That's why my attitude is different on this issue compared to #207.
It is a big deal when I get lambasted for not doing exactly what is expected. The views expressed above are nowhere near the strong gatekeeper consensus you expressed during the hangout. I'll not be making any more decisions; tell me what you want, and I'll do it.
Be explicit:
Now that we're no longer pressing to get a stable API by Saturday, I suggest you not do anything until there has been proper review of a proposed spec.
Where does the spec come from? Me? Does 1-4 above, together withe examples, suffice? If not, what? How is the spec reviewed?
I will prepare a spec. It will become a github issue just like everything else. If you want to prepare a spec too, you may, and we can compare them when the time comes. But that is up to you.
Since you don't want to proceed without guidance, I have assigned this issue to me.
Current behavior: curl -X POST https://devapi.opentreeoflife.org/v3/tree_of_life/subtree -H "content-type:application/json" -d '{"node_id":"ott803675"}' { "newick" : "(Gavia_stellata_ott1057044,((Gavia_arctica_ott1085739,Gavia_pacifica_ott651474)mrcaott651474ott1085739,(Gavia_immer_ott1057518,Gavia_adamsii_ott90560)mrcaott90560ott1057518)mrcaott90560ott651474)Gavia_ott803675;"}
The answer to #1 is yes. #2 is a good question. @kcranston what do you think, should we have a flag that controls whether to get v2-like behavior (i.e. suppress those long and mostly useless internal node labels)? Having seen some of these trees I think maybe we should.
In response to @josephwb questions:
(edited to change default from true to false and clarification on what the flag does)
I think the default would be false, since in v2 we didn't have all the mrca labels.
I'm assuming the flag would only control the mrca labels, not the internal ott labels.
This way, with the flag false, the v3 behavior would be the same as the v2 behavior: yes ott internal labels, no mrca internal node labels.
Is the goal to get this implemented for the current release? Seems low priority, as anyone getting trees is likely to throw out any internal node label. Plus, it feels awkward to have 2 label args:
label_format
include_all_node_labels
I'm inclined to just return all the labels and let the user dispose of them if they wish. Getting rid of the mrcaottXXXotYYY
labels but not the ottWWW
label isn't going to help anyone get anything done.
There is code in the tree_of_life (v2) plugin in the v3 branch to throw away the mrca labels.
label_format only applies to OTT nodes. include_all_node_labels only applies to non-OTT nodes. So I see what you mean but technically they are orthogonal.
include_all_node_labels is implemented (v3 branch on devapi) as @kcranston requested above.
I'm not too happy about the flag name. I'm groping for something short along the lines of 'include_id_when_unnamed' or 'use_id_for_label_when_node_is_unnamed' since the flag says whether the id should be used as the label when the node has no name (OTT id), instead of nothing at all. (For named nodes the label_format tells what label to use.) It's possible to put a spin on 'include_all_node_labels' that makes it a correct description, but it's hard: you have to say that the a priori label is according to the format, if named, or is the id, if unnamed; then you have to say "all labels" means use the a priori label in every case, while "not all labels" means use the a priori label only when the node is named.
There are 6 combinations of format + this flag, and they all make sense, but a couple of them are rather odd.
@kcranston This issue just needs documentation and test cases and it will be done. Shall we (I) proceed?
Do we want to expose this option in the webapps, perhaps as a checkbox? Or a drop-down, if there are more than two meaningful choices.
I thought we would just leave it in the API. It's sort of an advanced feature
Sounds good to me.
Currently only taxon labels are applied to newick nodes. e.g.