geneontology / noctua

Graph-based modeling environment for biology, including prototype editor and services
http://noctua.geneontology.org/
BSD 3-Clause "New" or "Revised" License
38 stars 12 forks source link

In some cases, there are issues around folding #325

Closed kltm closed 8 years ago

kltm commented 8 years ago

@cmungall uncovered a bug due to under-specified behavior in the folding algorithm when working on noctua-obo models. To quote myself from the email thread:

I thought about this a little bit and I this and it's actually doing
mostly the right thing (hang on); I believe it's just that we have some
unspecified behavior in the folder because GO models have tended to be
little more strict/structured about where stuff can be.

Basically, when the folder starts walking, it does so essentially
randomly (ignoring leaves) without coloring. Once neighbors are
disconnected, an upstream node can then fold up the node that was just
folded into. However, since this is not systematic, all sorts of weird
random-looking thing can happen when folding up. GO models tend not to
do things like this, so we've heard no complaints.

For all of the use cases I can think of right now, the correct tweak
would be to 1) color so that once a node contains a subgraph (has been
folded) it cannot be folded itself into another node.

There is a bug here too, when unfolding: after a few cycles, since the
unfolding algorithm isn't expecting nested subgraphs (IIRC), edges and
nodes can get disconnected and lost, require a re-request of the fresh
model. That is pretty bad--all ops should be 100% reversible--but I
believe very easy to fix.

A question then, before I write up the ticket. For the current rabbit
model, I've added another has_part. What would be the best thing to see?
I currently believe the most "satisfying" to be: three nodes containing
folded-in items: [set of upper jaw teeth], [mouth], and [tongue].
Another option would be to start folding at the leaves, have a more
nested internal node listing, and /allow/ non-random sub(-sub)-folding.
I think both would be legit.

After talking to @cmungall a little, we're going to punt on the exact "right" thing to do for the time being (this will also slow down #141, as the implementation will greatly affect that and I don't want to rewrite that area twice). This is, of course, an upstream bug, either in https://github.com/berkeleybop/bbop-graph or https://github.com/berkeleybop/bbop-graph-noctua, but since nobody really hangs out over there, I'll do main tracking here.

kltm commented 8 years ago

For the time being, I just want to make sure that no edges or nodes can be lost during "bad" transformations--even if we're inconsistent, we'll at least be safe.

kltm commented 8 years ago

I assumed that this was an error in unfold(), but that is looking pretty solid (and the tests so far back that up). Now looking at fold_go_noctua() (as fold_evidence() is looking sold from the tests as well).

kltm commented 8 years ago

Using report_state()--great function for probing btw--it is almost certain that 1) the error occurs while folding and 2) some kind of variable leak is occurring.

In the captured subgraph, two edges that should be there go missing and one edge that should not be there appears. This extra edge is referring to something that is not folded into the subgraph.

kltm commented 8 years ago

@cmungall Okay, as far as I can tell, while the system is still not consistent (i.e. it is possible to get different folds), it is now safe. This bump has been put out to noctua-obo; let me know if there are any other problems.

kltm commented 8 years ago

Will punt the deeper folding issues elsewhere.

kltm commented 8 years ago

[Drops :microphone:, rolls off of couch]

cmungall commented 8 years ago

πŸ‘ πŸ‘ πŸ‘ it works!

kltm commented 8 years ago

@cmungall There were some remaining weirdnesses with the "consistent" folding; I went ahead and removed sub-folding for the time being--the rules should be obvious now.