jianhao2016 / AllSet

This is the GitHub repository for our ICLR22 paper: "You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks"
MIT License

Confusion about the data object #13

Open mgao97 opened 11 months ago

mgao97 commented 11 months ago

Hi~

I am confused about the meaning of fields on the data object such as data.num_hyperedges and data.edge_index. Could you please offer a simple and clear example? What is the relationship between data.num_hyperedges and data.edge_index? And if I have a new hypergraph dataset given as nodes and lists of hyperedges, how do I convert it to the PyG data object?

elichienxD commented 11 months ago

Hi @mgao97 ,

If I remember correctly, data.num_hyperedges is the total number of hyperedges, as its name indicates. data.edge_index characterizes the node-hyperedge relation, where data.edge_index[0] holds node indices and data.edge_index[1] holds hyperedge indices. That is, node data.edge_index[0][i] is contained in hyperedge data.edge_index[1][i]. See https://github.com/jianhao2016/AllSet/blob/0d0e399a9168829fa898dd56f3d32bee36953b04/src/models.py#L439 for our code description. I may have forgotten some details, so maybe @thupchnsky and @jianhao2016 can comment further.
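To make the pairing concrete, here is a minimal sketch (plain Python lists, no PyG) of how a list of hyperedges could be turned into the two rows described above. The helper name incidence_pairs and the toy hypergraph are made up for illustration, and hyperedge ids start at 0 here, which is an assumption rather than the repository's convention.

```python
def incidence_pairs(hyperedges):
    """Return [node_ids, hyperedge_ids] so that node_ids[i] belongs to hyperedge_ids[i].

    `hyperedges` is a list of lists of node ids (hypothetical input format).
    """
    node_ids, hyperedge_ids = [], []
    for e, members in enumerate(hyperedges):
        for v in members:
            node_ids.append(v)       # row 0: node index
            hyperedge_ids.append(e)  # row 1: index of the hyperedge containing it
    return [node_ids, hyperedge_ids]


# Toy hypergraph: 4 nodes, hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}
hyperedges = [[0, 1, 2], [2, 3]]
edge_index = incidence_pairs(hyperedges)
num_hyperedges = len(hyperedges)

print(edge_index[0])   # node indices
print(edge_index[1])   # hyperedge indices
print(num_hyperedges)
```

Each column of edge_index is one (node, hyperedge) membership, so a node that sits in several hyperedges appears several times in row 0.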

mgao97 commented 11 months ago

Hi~

Thank you for your quick message. It is clear to me.

I have another problem. When I initialize the SetGNN model, the printed model shows that V2EConvs and E2VConvs are empty. When I train the model, I also hit the error shown below.

"ValueError Traceback (most recent call last)
/users/HyperGCL/src/backbones.ipynb cell 44 line 1
      9 st = time.time()
     10 optimizer.zero_grad()
---> 11 outs = model_setgnn(data)
     12 outs, lbl = outs[idx_train], lbls[idx_train]
     14 loss = F.cross_entropy(outs, lbl)

File ~/miniconda/envs/hyper/lib/python3.9/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

/users/HyperGCL/src/backbones.ipynb cell 44 line 1
    184 for i, _ in enumerate(self.V2EConvs):
    185     print(f'1 self.aggr {self.aggr}')
--> 186     x = F.relu(self.V2EConvs[i](x, edge_index, norm, self.aggr))
    187     # x = self.bnV2Es[i](x)
    188     x = F.dropout(x, p=self.dropout, training=self.training)
...
    198 if aggr is None:
--> 199     raise ValueError("aggr was not passed!")
    200 return scatter(inputs, index, dim=self.node_dim, reduce=aggr)

ValueError: aggr was not passed!"

However, I checked the "aggr" value before passing it to the model, specifically for the HalfNLHconv layer, and it was set.

Could you please offer some suggestions for solving the above problem? Thank you in advance!

elichienxD commented 11 months ago

Hi @mgao97 ,

Did you use the same PyG version and set up the environment as stated in our README? If you want to use a newer PyG version, you may want to check #1.

Best, Eli

mgao97 commented 10 months ago

Thank you for your message. I will check it later~

mgao97 commented 10 months ago

Hi~

Following your explanation and this line, "https://github.com/jianhao2016/AllSet/blob/6281a2f1a91f6f26040777bb0b2578fc035dc57a/src/preprocessing.py#L403", these three variables and their relationships still do not look right to me.

Could you please give an example of the data object and its parameters, such as n_x, edge_index, hyperedges, and num_hyperedges?

elichienxD commented 10 months ago

Hi @mgao97 ,

data.n_x[0] should be the number of nodes. data.num_hyperedges[0] should be the total number of hyperedges. data.edge_index[0].max().item() is the max hyperedge id. From star expansion, we have a bipartite graph in which ids 0 to n_x-1 correspond to the original nodes and ids n_x to n_x+num_hyperedges-1 correspond to hyperedges. You may check the figure in our paper for an illustration of the bipartite graph and star expansion.

Best, Eli
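For anyone converting their own hyperedge lists, the id layout described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name star_expansion is made up, and storing both directions (node to hyperedge, then hyperedge to node) is my assumption about why the maximum of row 0 can be a hyperedge id; it is not confirmed in this thread. Wrapping the result in a tensor and a PyG data object is left out.

```python
def star_expansion(hyperedges, n_x):
    """Build a bipartite edge list where hyperedge j gets id n_x + j.

    Assumption (not confirmed by the thread): both directions are stored,
    first node -> hyperedge, then hyperedge -> node.
    """
    src, dst = [], []
    for j, members in enumerate(hyperedges):
        e_id = n_x + j  # hyperedge ids come right after the node ids
        for v in members:
            src.append(v)
            dst.append(e_id)
    row0 = src + dst  # nodes first, then hyperedges
    row1 = dst + src
    return [row0, row1]


# Toy hypergraph: 4 nodes, hyperedges {0, 1, 2} and {2, 3}
hyperedges = [[0, 1, 2], [2, 3]]
n_x = 4
num_hyperedges = len(hyperedges)
edge_index = star_expansion(hyperedges, n_x)

# Consistency check in the spirit of the relationship described above:
# the largest id in row 0 should be n_x + num_hyperedges - 1.
max_id = max(edge_index[0])
print(max_id == n_x + num_hyperedges - 1)
```

With this layout, nodes use ids 0 to 3 and the two hyperedges use ids 4 and 5, so the three quantities n_x, num_hyperedges, and the maximum id in row 0 are tied together by a single equality.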