dmlc / dgl

Python package built to ease deep learning on graph, on top of existing DL frameworks.
http://dgl.ai
Apache License 2.0

Question of running ‘apply_edges’ in a contrastive model, a potential bug? #4836

Closed buaalyx closed 1 year ago

buaalyx commented 1 year ago

I want to implement a contrastive graph model. I have one encoder that is applied to two heterogeneous graphs, G_raw and G_aug, where G_aug is obtained by removing some edges from G_raw. The details of the two graphs are as follows:

G_raw:
Graph(num_nodes={'author': 7167, 'paper': 4019, 'subject': 60},
      num_edges={('author', 'author-paper', 'paper'): 13407, ('paper', 'paper-author', 'author'): 13407, ('paper', 'paper-subject', 'subject'): 4019, ('subject', 'subject-paper', 'paper'): 4019},
      metagraph=[('author', 'paper', 'author-paper'), ('paper', 'author', 'paper-author'), ('paper', 'subject', 'paper-subject'), ('subject', 'paper', 'subject-paper')])

G_aug:
Graph(num_nodes={'author': 7167, 'paper': 4019, 'subject': 60},
      num_edges={('author', 'author-paper', 'paper'): 11627, ('paper', 'paper-author', 'author'): 11568, ('paper', 'paper-subject', 'subject'): 3225, ('subject', 'subject-paper', 'paper'): 3232},
      metagraph=[('author', 'paper', 'author-paper'), ('paper', 'author', 'paper-author'), ('paper', 'subject', 'paper-subject'), ('subject', 'paper', 'subject-paper')])

The encoder works on G_raw: the forward pass encoder(G_raw) runs without error.

However, the forward pass on G_aug, encoder(G_aug), fails with the following error:


 File "/root/Downloads/lyx/contra/godsake/contra_hgt/model/hyp_model.py", line 100, in forward
    sub_graph.apply_edges(fn.v_add_u('q', 'k', 't'))
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/heterograph.py", line 4463, in apply_edges
    self._set_e_repr(etid, eid, edata)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/heterograph.py", line 4238, in _set_e_repr
    self._edge_frames[etid].update(data)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/_collections_abc.py", line 941, in update
    self[key] = other[key]
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/frame.py", line 584, in __setitem__
    self.update_column(name, data)
  File "/usr/local/anaconda3/envs/gcl/lib/python3.9/site-packages/dgl/frame.py", line 661, in update_column
    raise DGLError('Expected data to have %d rows, got %d.' %
dgl._ffi.base.DGLError: Expected data to have 13407 rows, got 11627.

The corresponding code snippet is as follows:


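import dgl.function as fn

def forward(self, G, h):
    with G.local_scope():
        for srctype, etype, dsttype in G.canonical_etypes:
            # Slice out the single-relation subgraph for this edge type.
            sub_graph = G[srctype, etype, dsttype]
            ......
            # Edge feature 't' = destination 'q' + source 'k'; the error is raised here.
            sub_graph.apply_edges(fn.v_add_u('q', 'k', 't'))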

I printed the relevant data just before sub_graph.apply_edges() and got:


> subgraph:
Graph(num_nodes={'author': 7167, 'paper': 4019},
      num_edges={('author', 'author-paper', 'paper'): 11627},
      metagraph=[('author', 'paper', 'author-paper')]),  and:
sub_graph.srcdata['k'].shape: torch.Size([7167, 256])
sub_graph.dstdata['q'].shape: torch.Size([4019, 256])

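I don't understand why apply_edges() on a sub_graph with 11627 edges reports 'Expected data to have 13407 rows'. Note that 13407 is the number of 'author-paper' edges in G_raw, while 11627 is the number in G_aug.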
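
One way to narrow this down, offered only as a sketch: the error is raised while writing a new column into an edge frame that expects 13407 rows, so it may help to check whether G_aug already carries edge features sized for G_raw. The helper below (find_edge_feature_mismatches is a hypothetical name, not part of DGL) relies only on canonical_etypes, num_edges() and edges[etype].data, and compares the first dimension of each stored edge feature with the edge count of its edge type. It inspects only features that are already stored on the graph, so an empty result does not rule out a stale frame.

```python
def find_edge_feature_mismatches(G):
    """Return (canonical_etype, feature_name, n_rows, n_edges) for every stored
    edge feature whose first dimension differs from the edge count."""
    mismatches = []
    for cet in G.canonical_etypes:
        n_edges = G.num_edges(cet)
        # G.edges[cet].data is the dict-like view of edge features for this edge type.
        for name, feat in G.edges[cet].data.items():
            n_rows = feat.shape[0]
            if n_rows != n_edges:
                mismatches.append((cet, name, n_rows, n_edges))
    return mismatches
```

Calling find_edge_feature_mismatches(G_aug) right before encoder(G_aug) would list any edge feature that still has G_raw's row count. If the list is empty, the mismatch presumably comes from somewhere other than a stored feature tensor.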

I noticed the earlier question "Cannot assign edge data after g.remove_edges", but I still run into a similar problem. My DGL version is:

dgl-cu110                 0.8.2.post1              pypi_0    pypi
dglgo                     0.0.1                    pypi_0    pypi

buaalyx commented 1 year ago

I found another bug report with the same error message ("DGLError: Expected data to have %d rows, got %d."), which occurs at large batch sizes, so I guess I have hit a different bug?

BarclayII commented 1 year ago

The same question is being asked in https://discuss.dgl.ai/t/question-of-running-apply-edges-in-a-contrastive-model-a-potential-bug/3294 so we will discuss there. I'll reopen this issue if a bug is confirmed.