networkx / networkx

Network Analysis in Python
https://networkx.org
Other
14.96k stars 3.25k forks source link

Feature request - dynamic topological sort #3729

Open iljau opened 4 years ago

iljau commented 4 years ago

Currently there exists lexicographical_topological_sort, which can be used to sort DAG in topological order.

For some cases is may be useful to keep list sorted in topological order in case of node/edge additions and removals. Feature request is to implement such "dynamic topological sort".

Found one paper describing algorithm: David J. Pearce and Paul H. J. Kelly. A dynamic topological sort algorithm for directed acyclic graphs: https://www.doc.ic.ac.uk/~phjk/Publications/DynamicTopoSortAlg-JEA-07.pdf

Source code (by same author) for various "dynamic topological sort" algorithms in C++: https://github.com/DavePearce/DynamicTopologicalSort/tree/master/src/oto-test

There also appears to be Java implementation of mentioned paper: https://jgrapht.org/javadoc/org/jgrapht/graph/DirectedAcyclicGraph.html https://github.com/jgrapht/jgrapht/blob/master/jgrapht-core/src/main/java/org/jgrapht/graph/DirectedAcyclicGraph.java

AbhayGoyal commented 3 years ago

Hey, I can work on this

kpetridis24 commented 2 years ago

@iljau @MridulS I started working on this. I will let u know about any possible questions

kpetridis24 commented 2 years ago

@iljau My first question is this.

The paper talks about the topological ordering. It assumes that ord(x) for a node x is known but it never explains how. Is there a function to compute the ord(x)? When i was thinking this at first i thought that, well, we have for example 1->2->3->4 and we want to add 5->2. So we

  1. DFS for 2
  2. Reverse DFS for 5 (let's assume that 6->5 for a node 6)
  3. Perform the re-ordering However the paper has its individual implementation of DFS where it checks if the ord(y) < ord(x) for the new node y. So i have to integrate this check into the original DFS but what is the ord function and how should i implement it? Any ideas?
dschult commented 2 years ago

I think the algorithm you are working with assumes you've got the topological order stored alongside the graph, so when any graph change occurs, you update the order and thus still have the correct order each time you change the graph. That's what would make it "dynamic". It slows down the process of adding nodes or edges, but might result in speed-ups if you have to compute the order many times while a graph is changing.

kpetridis24 commented 2 years ago

The thing is, when i try to add a new edge (x,y) with x->y, the algorithm needs to know the ord(x), but we do not acquire such a function. Even the functions implemented till this point need to access all nodes to sort them and there is no function called to return the ord of x immediately. However in original topological sorting, x->y implies that x always comes before y and ord(x) < ord(y). Does the paper means it this way? This approach does not include any function 'ord', it only takes into consideration the fact that x points to y, so it has to be placed before it. Am i missing smth? @dschult

kpetridis24 commented 2 years ago

For example i have 1->2->3->4->5 6->7 Adding 8 such that 7->8 and 8->3 i need to get [1 2 6 7 8 3 4 5] By separating sB = [6 7 8] which is returned by the backward DFS traversal sF = [3 4 5] from the forward DFS traversal Placing them in topological order [6 7 8 3 4 5] and merging them in the initial array

This didn't take into consideration any function 'ord'. However it computed the correct result just by looking at who points to who

iljau commented 2 years ago

Some implementations of that algorithm are listed in "Notes" section: https://whileydave.com/publications/pk07_jea/

kpetridis24 commented 2 years ago

The first implementation i made seems to work. However, it generates a new topological sorting, which is different from that, generated by the already implemented functions. As i understand, topological ordering is not unique

for example: (0 and 1)->2->3->4->5

both 0 and 1 point to 2. So both topological orderings [0 1 2 3 4 5] and [1 0 2 3 4 5] are correct.

Is this valid?