Qiskit / rustworkx

A high performance Python graph library implemented in Rust.
https://www.rustworkx.org
Apache License 2.0
1.08k stars 148 forks source link

QUESTION: are methods like dijkstra_shortest_path_lengths sorted? #1191

Closed ConstantinoSchillebeeckx closed 5 months ago

ConstantinoSchillebeeckx commented 5 months ago

What is the expected enhancement?

This is a question and potentially an enhancement to documentation; my use case is that I'm trying to find the furthest ancestor for a given node in a graph. For example in the following graph:

image

the node furthest from node-8 is node-0

I'm doing something like:


import rustworkx as rx

graph = rx.PyDiGraph(check_cycle=True, multigraph=False)
graph.extend_from_edge_list(edges) # assume edges is a list of tuples

for k,v in rx.dijkstra_shortest_path_lengths(graph, 8, lambda x: 1).items():
    print(k, v)

I'm noticing that the first iteration of my for loop above seems to return the node furthest from node 8 (similarly for a method like k_shortest_path_lengths) - is that guaranteed, or just an artifact of my small dataset?

IvanIsCoding commented 5 months ago

Our library is implemented in Rust, which has a few quirks. One of the main differences is that, in Python, dictionaries preserve insertion order by default. In Rust, however, we need to explicitly choose to do so.

With that being said, the only guarantee we promise is that:

# both loops return the same order
paths =  rx.dijkstra_shortest_path_lengths(graph, 8, lambda x: 1)
for k,v in paths.items():
    # loop 1
    print(k, v)
for k,v in paths.items():
    # loop 2
    print(k, v)

The loops will print the same thing. This looks silly but #347 was a real issue we dealt with.

However, due to implementation details, dijkstra_shortest_path_lengths preseves the insertion order of whatever we put on it. Currently, we insert the nodes in the order they appear in the shortest path so that is the output you see. But we have no test for that! We only test if the distances are correct.

So, I higly recommend you implement something like:

paths = rx.dijkstra_shortest_path_lengths(graph, 8, lambda x: 1).items()
sorted_paths = sorted(paths, key= lambda x: x[1])

To guarantee the paths are ordered between version updates. Also, if the items are already sorted Python will have an optimization so it shouldn't be too bad

If you just need the furthest node, the following also works:

paths = rx.dijkstra_shortest_path_lengths(graph, 8, lambda x: 1).items()
furthest_path = max(paths, key=lambda x: x[1])