pyg-lib neighbor_sampling issue

smziaurrashid commented 11 months ago

I get the following errors when running with pyg-lib==0.3.1 and torch-sparse==0.6.18+pt20cu118.

python main.py --data HI-Small --model gat --reverse_mp

Traceback (most recent call last):
  File "/home/zia/Multi-GNN/main.py", line 32, in <module>
    main()
  File "/home/zia/Multi-GNN/main.py", line 29, in main
    train_gnn(tr_data, val_data, te_data, tr_inds, val_inds, te_inds, args)
  File "/home/zia/Multi-GNN/training.py", line 203, in train_gnn
    sample_batch = next(iter(tr_loader))
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/loader/base.py", line 36, in __next__
    return self.transform_fn(next(self.iterator))
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/loader/link_loader.py", line 182, in collate_fn
    out = self.link_sampler.sample_from_edges(
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 182, in sample_from_edges
    return edge_sample(inputs, self._sample, self.num_nodes, self.disjoint,
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 481, in edge_sample
    out = sample_fn(seed_dict, seed_time_dict)
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 209, in _sample
    out = torch.ops.pyg.hetero_neighbor_sample(
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/_ops.py", line 502, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: pyg::hetero_neighbor_sample() Expected a value of type 'Optional[Dict[str, Tensor]]' for argument 'edge_weight_dict' but instead found type 'bool'.
Position: 8
Value: True
Declaration: pyg::hetero_neighbor_sample(str[] node_types, (str, str, str)[] edge_types, Dict(str, Tensor) rowptr_dict, Dict(str, Tensor) col_dict, Dict(str, Tensor) seed_dict, Dict(str, int[]) num_neighbors_dict, Dict(str, Tensor)? time_dict=None, Dict(str, Tensor)? seed_time_dict=None, Dict(str, Tensor)? edge_weight_dict=None, bool csc=False, bool replace=False, bool directed=True, bool disjoint=False, str temporal_strategy="uniform", bool return_edge_id=True) -> (Dict(str, Tensor), Dict(str, Tensor), Dict(str, Tensor), Dict(str, Tensor)?, Dict(str, int[]), Dict(str, int[]))
 Python error details: TypeError: 'bool' object is not iterable

python main.py --data HI-Small --model gin --emlps

Traceback (most recent call last):
  File "/home/zia/Multi-GNN/main.py", line 32, in <module>
    main()
  File "/home/zia/Multi-GNN/main.py", line 29, in main
    train_gnn(tr_data, val_data, te_data, tr_inds, val_inds, te_inds, args)
  File "/home/zia/Multi-GNN/training.py", line 203, in train_gnn
    sample_batch = next(iter(tr_loader))
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/loader/base.py", line 36, in __next__
    return self.transform_fn(next(self.iterator))
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 677, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/loader/link_loader.py", line 182, in collate_fn
    out = self.link_sampler.sample_from_edges(
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 182, in sample_from_edges
    return edge_sample(inputs, self._sample, self.num_nodes, self.disjoint,
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 550, in edge_sample
    out = sample_fn(seed, seed_time)
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch_geometric/sampler/neighbor_sampler.py", line 282, in _sample
    out = torch.ops.pyg.neighbor_sample(
  File "/home/zia/anaconda3/envs/ibm-gnn/lib/python3.9/site-packages/torch/_ops.py", line 502, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: pyg::neighbor_sample() Expected a value of type 'Optional[Tensor]' for argument 'edge_weight' but instead found type 'bool'.
Position: 6
Value: True
Declaration: pyg::neighbor_sample(Tensor rowptr, Tensor col, Tensor seed, int[] num_neighbors, Tensor? time=None, Tensor? seed_time=None, Tensor? edge_weight=None, bool csc=False, bool replace=False, bool directed=True, bool disjoint=False, str temporal_strategy="uniform", bool return_edge_id=True) -> (Tensor, Tensor, Tensor, Tensor?, int[], int[])
Cast error details: Unable to cast True to Tensor
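Both tracebacks have the same shape: a boolean lands in the slot that the installed pyg-lib operator declares as `edge_weight` (position 6) or `edge_weight_dict` (position 8), immediately before `csc`. That pattern is consistent with a caller that passes its arguments positionally against an older signature without the edge-weight parameter. The sketch below reproduces the effect with toy functions; `neighbor_sample_old` and `neighbor_sample_new` are hypothetical stand-ins, not the real pyg-lib operators, and the older signature is an assumption inferred from the error message.

```python
# Toy stand-ins for an operator whose signature gained a parameter.
# These are NOT the real pyg-lib functions; the "old" signature is assumed.

def neighbor_sample_old(rowptr, col, seed, num_neighbors,
                        time=None, seed_time=None,
                        csc=False, replace=False):
    """Assumed older signature: `csc` sits at positional index 6."""
    return {"csc": csc, "replace": replace}


def neighbor_sample_new(rowptr, col, seed, num_neighbors,
                        time=None, seed_time=None, edge_weight=None,
                        csc=False, replace=False):
    """Newer signature: `edge_weight` was inserted before `csc`."""
    if edge_weight is not None and not isinstance(edge_weight, list):
        raise TypeError(
            "Expected a list or None for argument 'edge_weight' "
            f"but instead found type '{type(edge_weight).__name__}'"
        )
    return {"csc": csc, "replace": replace}


# A caller written against the old signature passes everything positionally.
args = ([0, 1], [0], [0], [1], None, None, True, False)

print(neighbor_sample_old(*args))      # csc receives True, as intended

try:
    neighbor_sample_new(*args)         # True now lands in `edge_weight`
except TypeError as err:
    print("mismatch:", err)

# Passing the flags by keyword is unaffected by the inserted parameter.
print(neighbor_sample_new(*args[:6], csc=True, replace=False))
```

If this reading is right, the fix is to install a torch_geometric and pyg-lib pair that were released to work together, or to remove pyg-lib, rather than to change the training code.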

Without installing pyg-lib and just using torch-sparse, I am getting the following output:

Could you please tell me how to address the issues above? Which specific versions of pyg-lib and torch-sparse do I need to install to run the model without errors?
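When asking about version compatibility, it helps to report exactly what is installed in the environment. A minimal sketch using only the standard library follows; the distribution names in `PACKAGES` are assumptions about how the packages are registered with pip, and `installed_versions` is a helper name made up for this example.

```python
from importlib import metadata

# Assumed pip distribution names; adjust if your environment differs.
PACKAGES = ("torch", "torch-geometric", "pyg-lib", "torch-sparse")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output next to a traceback makes it easier to spot a torch_geometric release that is older or newer than the pyg-lib build it is calling into.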

LucTuc commented 10 months ago

Hi,

pyg-lib currently doesn't work with the Python version required by other, more critical libraries, so you can just use torch-sparse instead. I have updated the env.yml file, and you should be able to get all the necessary packages through it.
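To confirm that an environment really is in the torch-sparse-only state described above, one option is to check which of the two packages can be imported. This is a sketch, not part of the repository: `sampling_backends` is a hypothetical helper, and the statement that torch_geometric prefers pyg-lib when both are present is an assumption about the installed release.

```python
from importlib.util import find_spec


def sampling_backends():
    """Report whether the pyg_lib and torch_sparse modules are importable."""
    return {name: find_spec(name) is not None
            for name in ("pyg_lib", "torch_sparse")}


if __name__ == "__main__":
    backends = sampling_backends()
    print(backends)
    if backends["pyg_lib"]:
        # Assumption: torch_geometric picks pyg-lib over torch-sparse if present.
        print("pyg-lib is still installed; uninstall it to use torch-sparse only.")
    elif not backends["torch_sparse"]:
        print("Neither backend found; neighbor sampling will not be available.")
```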

smziaurrashid commented 9 months ago

@LucTuc Thanks for your response and for updating the code. I tried the updated code, but I am unable to reproduce similar results. The scores I am getting are very low. Am I missing something here?

main.py --data data --model gin --reverse_mp --ego --ports

Here is the output log: output.log

LucTuc commented 9 months ago

Hey, could you include the full log output, i.e., also everything that gets written to the logger during data loading?

LucTuc commented 9 months ago

Hi @smziaurrashid, your issues were most likely due to a bug I accidentally introduced a few weeks ago when adding other features. It caused the optimization to not work properly when using reverse message passing. I have now fixed the bug and pushed the changes. Please pull and verify that you get the expected results on your local machine. :) Thanks for raising the issue!

IBM / Multi-GNN

pyg-lib neighbor_sampling issue #4