OpenTreeOfLife / opentree

Opentree browsing and curation web site. For overarching or cross-repo concerns, please see the 'germinator' repo.
http://tree.opentreeoflife.org/
BSD 2-Clause "Simplified" License
111 stars 26 forks source link

Bug in conflict service. #935

Closed bredelings closed 8 years ago

bredelings commented 8 years ago

On pg_2448, node900303 has 2 children: Clostridium phytofermentans and Abiotrophia defectiva The edge is marked as Resolves Firmicutes in the conflict with the taxonomy tree.

life > cellular organisms > Bacteria > Firmicutes > Clostridia > Clostridiales > Lachnospiraceae > Anaerosporobacter > Lachnoclostridium phytofermentans

life > cellular organisms > Bacteria > Firmicutes > Bacilli > Lactobacillales > Aerococcaceae > Abiotrophia > Abiotrophia defectiva (species ncbi:46125)

So, I think this node should conflict with all the above-mentioned on the ancestry chain up to, but not including, Firmicutes.

bredelings commented 8 years ago

To show that this must conflict, suppose we have the simplified taxonomy tree ((Lp, Co)Clostridium), (Ad,Bo)Bacillus))Firmicutes. Here Lp = Lachnoclostridium phytofermentans Ad = Abiotrophia defectiva Co = other Clostridia Bo = other Bacillus The split Lp Ad | Co Bo (currently marked as Resolves Firmicutes) conflicts with Lp Co | Ad Bo (from the taxonomy node for Clostridia). Even ignoring the implicit root on the right side of the splits, each side of each split intersects both sides of the other split. So, there are 4 intersections total.

bredelings commented 8 years ago

The above comment shows that the (Ad,Lp) node conflicts with the Clostridium node, as long as that node has any children. Thus, the proof above generalizes to any taxon on the ancestry chain (from either Ad or Lp) up to (but not including) Firmicutes, as long as that taxon has more than one child.

bredelings commented 8 years ago

If I remember correctly, the resolves case and the conflicts case are pretty similar, in that both end up tagging the MRCA(of the include group) node with two marks, indicating that the MRCA(of the include group) node has children in both the include group and the exclude group of the relevant split.

A taxonomy/synth node should be marked as in conflict with the relevant input-tree node if the taxonomy/synth node has children in the include AND exclude group AND its parent is tagged as having children in the include group.

The last condition ensures that the conflicted node has some, but not all, of the include group. So, the MRCA(of the include group) node itself should fail the last condition, since it contains all of the include group as children.

jar398 commented 8 years ago

@jimallman I put the fix on devapi, but all is not OK. For this https://devtree.opentreeoflife.org/curator/study/view/pg_2448/?tab=trees&tree=tree5223&conflict=ott I'm getting "Sorry, there was an error in the conflict data" and when I "show details" there is nothing there. I wasn't getting this before.

The corresponding API call is: wget -O tmp.json "https://devapi.opentreeoflife.org/v2/conflict/compare?tree1=pg_2448%23tree5223&tree2=ott" and I don't see any errors in the resulting "conflict data".

Not sure how to debug this.

jar398 commented 8 years ago

@jimallman The ingroup had not been set. When I set the ingroup, it worked fine.

It is not obvious that conflict coloring should fail when the ingroup is not set. It would be nice if the error were a bit easier to understand.

jimallman commented 8 years ago

The ingroup had not been set. When I set the ingroup, it worked fine.

That's odd. I'll take a look.

jimallman commented 8 years ago

I've compared conflict output (vs. OTT) for this tree, with and without an ingroup, and I see no difference via cURL or in the curation app. To be specific, I removed the ingroup designation in the curation app, saved the study, and showed the conflict display as expected.

We might need to revert this study to a prior state (in phylesystem) to reproduce the original error.

jar398 commented 8 years ago

Foo. Let's say it's an unimportant transient.

jimallman commented 8 years ago

Yep. Of course, we can keep testing against other trees in the system that lack a designated ingroup.

It would be hilarious if the bug were somehow related to the number of listed curators. But I just don't see any connection, unless it were in a seriously broken NexSON validator.

jar398 commented 8 years ago

Can't see any other actions to take, so closing