Qiskit / rustworkx

A high performance Python graph library implemented in Rust.
https://www.rustworkx.org
Apache License 2.0
1.03k stars 145 forks source link

Dataframe output of all_pairs_dijkstra_shortest_paths? #1119

Open thomasaarholt opened 6 months ago

thomasaarholt commented 6 months ago

What is the expected enhancement?

I experience that it is very slow to convert the dictionary returned by all_pairs_dijkstra_shortest_paths on a large graph into a dataframe. Unnesting the dict into three lists (of 1.6 Million rows) works well, but when I read this into a dataframe as follows, it takes an extremely long time and can crash my system.

import polars as pl
mapping = all_pairs_dijkstra_shortest_paths(...)
left, right, path = nested_dict_to_df(mapping)
df_distances = pl.DataFrame(
    [
        pl.Series(name="left", values=left, dtype=pl.Int32),
        pl.Series(name="right", values=right, dtype=pl.Int32),
        pl.Series(name="path", values=path, dtype=pl.List(Float64)),
    ]
)

I believe this is due to the dictionary entries not being contiguous in memory, or some similar effect. It would be very nice to be able to export rustworkx output directly to a dataframe for further processing.

IvanIsCoding commented 6 months ago

I think this is aligned with #1033, I don't think it will be easy to implement but it is our most requested integration.

In the meantime, maybe you can use .explode() with .apply() to see if it performs slightly better? Something along these lines of Pandas code to avoid Python loops which are known to be slow. It might be able to be ported to Polars as well.

import rustworkx as rx
import pandas as pd

graph = rx.generators.heavy_square_graph(5)
all_paths = rx.all_pairs_dijkstra_shortest_paths(graph, lambda _: 1.0)

df = pd.DataFrame(
    {
        "U": all_paths.keys(),
        "V": all_paths.values(),
    }
)

df = df.explode("V")
df["U->V"] = df.apply(lambda x: all_paths[x["U"]][x["V"]], axis=1)

If you print(df.tail()):

     U   V                                               U->V
64  64   1  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...
64  64   5  (64, 44, 60, 18, 59, 17, 58, 16, 57, 35, 53, 1...
64  64  25  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 6...
64  64  45  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...
64  64   0  (64, 44, 60, 38, 56, 13, 55, 12, 54, 32, 50, 2...