thechiselgroup / biomixer

BioMixer
http://bio-mixer.appspot.com/
16 stars 13 forks source link

Edge Registry Limitations #410

Closed everbeek closed 10 years ago

everbeek commented 10 years ago

I think I need to make the edge registry a permanent registry, with no status changes. All edges need to be entered as a value for the dyad of endpoints, so for node A<->B, we need to enter their edge under A,B as well as B,A. We never remove it from the registry. We still only add edges on the basis of new edges coming in, and we still remove all linked edges when a node is removed.

Memory consumption will not grow too fast.

This was spawned from #400.

everbeek commented 10 years ago

Notes I wrote in #400, moved here:

Having trouble with empty target. It isn't in the node registry, and shouldn't be coming from the edge registry. Are the edges not being re-registered properly on node deletion? Or is this ok, but we need to check for undefined in the graph/node registry, and reject edges that fail at that point?

Verifying that registry is intended for nodes not yet in graph leading to nodes in graph. Maybe when they are re-added, it's the wrong order?? I think that when I deleted all the nodes, I re-registered edges in the registry, because the nodes were removed, so the edges become latent again...except that when I remove all endpoints, they aren't even latent edges! Ut oh! I have to clean out edges that don't have the 2nd tier index in the graph either, that is, to iterate through all registered edges and check for graph inclusion. Ugh...ok, besides that feeling stinky, I feel like it will not work, because we don't re-parse nodes (and thus their links) when we undo/redo, we use the objects themselves. This would mean that removing the edges from the registry entirely would remove the edges entirely.

Ok...so either we can check for undefined source/target where I located the bug, or we can track edges alongside nodes, and never use the edge registry when undo/redo is involved. The edge registry is still 100% required for normal forward activity, since it prevents a huge slough of unnecessary REST calls. Anyway, looking further at whether a simple null check would work, or if I should store edges in the Expansion/Deletion Sets...

Reasoning a bit: if edges were in the expansion/deletion sets, they could be added after nodes are, so long as the sets are always done in the same order, which is a critical requirement anyway, and is how they are designed already. So that looks fine. But what about the cheap workaround of a null check? That check would be, in a sense, hiding buggy broken behaviors coming out of the edge registry, in addition to allowing expansion post deletion to occur (in our case, changing graph type from "term neighborhood" to "paths to root"). So the easier way is not acceptable, and the harder way has the same assumptions, but will require additional graph code for loading edge objects from somewhere other than the registry.

For deletion, storing edges works. Undo will add them back directly. If an expansion occurs that is the other end of those edges, that expansion will get the edge re-registered. This will work. On redo, edges are still removed. For expansions though, when they are undone, the other ends of the edges are still in view, and therefore we can rely on the registry...but no we cannot! Because the expansion itself may have nodes relating to one another, and because redoing an expansion does not re-parse, then those edges would be registered with invalid targets (though they would become valid on redo of the expansion). A problem would arise if we did undo of an expansion, re-registered the edges therein, then did a different expansion that involved only one endpoint of an edge we had re-registered: it would try to manifest that edge with a missing target. So, expansions must not be re-registering edges, but must store them explicitly.

The trickiness is that the edge registry is only added to when parsing, and is removed from whenever an edge is added to the graph. The same edge could have its endpoints added via other circuitous expansions, but since the edge isn't in the registry, the edge would not manifest. Ok, this is troublesome...it is starting to look like the edge registry should indeed carry edges that are manifested and those that have been manifested, but will prevent manifestation if source or target are not present. If this is true, do I have to worry about the edge showing up in reverse target-source order when doing circuitous expansions?

[Consider reimplementing registry as a full double map that should store all edges ever encountered, never removed, perhaps never tagged, and added to graph on the basis of new nodes and other node being in the graph. Remove nodes from graph, always remove associated edges from graph and has nothing to do with the registry. I'd attach to nodes, but we need to look up by id not node object.]