dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0
13.48k stars 3.02k forks source link

Add sanity check for add_edges and add_nodes #2748

Open noncomputable opened 3 years ago

noncomputable commented 3 years ago

🐛 Bug

I call g.remove_nodes() to remove 1 node from certain graphs, and python crashes (or the ipython kernel dies).

My event logs show an exception code 0xc0000409, which according to this is a STATUS_STACK_BUFFER_OVERRUN. I tracked down the error to a call to _CAPI_DGLHeteroVertexSubgraph(self, vids) here: https://github.com/dmlc/dgl/blob/8a07ab77376a99b7114d0850ff99331ed88a648e/python/dgl/heterograph_index.py#L803

The problem is not the size of the graphs, as I am able to create and remove nodes from much larger graphs than those causing this problem.

To Reproduce

Steps to reproduce the behavior:

  1. Generate a problematic graph. I uploaded one here: https://drive.google.com/file/d/1jUNZLx2O4Y3lemy8EVAgUP4Zy5MmyCkz/view?usp=sharing
  2. g = dgl.load_graphs("debug_graphs.bin")[0][0]
  3. g.remove_nodes([torch.tensor(49)])

Other methods like graph.in_edges seem to cause similar crashes.

If this is an issue with how graph.remove_nodes and other methods are implemented, that should be fixed. If this is an issue with the structure of these graphs, dgl should identify these structural problems when adding a node or edge and throw an informative exception.

Environment

noncomputable commented 3 years ago

I found the source: edges with negative indices unintentionally getting added with g.add_edges. g.add_edges should throw an exception if any of the provided indices are negative.

BarclayII commented 3 years ago

Good catch! Mind if you add a PR?

decoherencer commented 2 years ago

For g.add_edges "The IDs of the new edges will start from g.num_edges(etype)", so I am not sure how the indices end up negative, as the num_edges >=0