pathpy / pathpyG

GPU-accelerated Next-Generation Network Analytics and Graph Learning for Time Series Data on Complex Networks.
https://www.pathpy.net
GNU Affero General Public License v3.0
24 stars 0 forks source link

How should we handle `num_nodes`? #171

Open M-Lampert opened 2 months ago

M-Lampert commented 2 months ago

Currently the method MultiOrderModel.lift_order_edge_index(...) requires num_nodes as argument. If you set this incorrectly, you will get an error as follows:

IndexError                                Traceback (most recent call last)
Cell In[33], line 10
      7 m = pp.MultiOrderModel.from_DAGs(toy_paths, max_order=max_order, mode = "propagation")
      9 edge_index = m.layers[1].data.edge_index
---> 10 m.lift_order_edge_index(edge_index, edge_index.shape[1])

File /workspaces/pathpyG/src/pathpyG/core/MultiOrderModel.py:64, in MultiOrderModel.lift_order_edge_index(edge_index, num_nodes)
     61 outdegree = degree(edge_index[0], dtype=torch.long, num_nodes=num_nodes)
     62 # Map outdegree to each destination node to create an edge for each combination
     63 # of incoming and outgoing edges for each destination node
---> 64 outdegree_per_dst = outdegree[edge_index[1]]
     65 num_new_edges = outdegree_per_dst.sum()
     66 # Create sources of the new higher-order edges

IndexError: index 3 is out of bounds for dimension 0 with size 3

This happened to @VincenzoPerri and will probably also happen to other users in the future. We should think about doing some kind of check and a warning or setting a default for easier usability.

IngoScholtes commented 1 month ago

The function Graph.from_edge_index now accepts a parameter num_nodes

IngoScholtes commented 1 month ago

I do not quite follow at which point this issue occurs, i.e. where the specification of num_nodes in the MultiOrderModel class is missing. This does not seem to be related to the Graph class?

Could you create a minimal example that reproduces the error, so I can debug this?

M-Lampert commented 1 month ago

The issue was here: https://github.com/pathpy/pathpyG/blob/da86884751540e91e63233a3fb02af1eb3d883ef/src/pathpyG/core/MultiOrderModel.py#L51 MultiOrderModel.lift_order_edge_index(...) requires the num_nodes parameter and if it is set correctly, then everything works as expected. But @VincenzoPerri 's problem was that he specified the number of edges instead. I took this as a precedent that the current implementation is not as user-friendly as it could be. But I am unsure of what to change.