pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License

`Segmentation Fault` in `HeteroData` and `LinkNeighborLoader` #7663

Closed denadai2 closed 1 year ago

denadai2 commented 1 year ago

🐛 Describe the bug

Running the code below ends in a segmentation fault. Is there a specific reason for it?

import argparse
import os.path as osp

import torch
import torch.nn.functional as F
from torch.nn import Linear

import torch_geometric.transforms as T
from torch_geometric.datasets import MovieLens
from torch_geometric.nn import SAGEConv, to_hetero

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

path = osp.join('data/MovieLens')
dataset = MovieLens(path, model_name='all-MiniLM-L6-v2')
data = dataset[0].to(device)

# Add user node features for message passing:
data['user'].x = torch.eye(data['user'].num_nodes, device=device)

# Add a reverse ('movie', 'rev_rates', 'user') relation for message passing:
data = T.ToUndirected()(data)

# Perform a link-level split into training, validation, and test edges:
train_data, val_data, test_data = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    neg_sampling_ratio=0.0,
    edge_types=[('user', 'rates', 'movie')],
    rev_edge_types=[('movie', 'rev_rates', 'user')],
)(data)

print("data")
print(data)
print()
print(train_data)
print()
print(val_data)

from torch_geometric.loader import LinkNeighborLoader

for d in [train_data, val_data]:
    loader = LinkNeighborLoader(
        d,
        num_neighbors=[10] * 2,
        batch_size=1,
        edge_label_index=(('user', 'rates', 'movie'), d[tuple(['user', 'rates', 'movie'])].edge_label_index),
    )

    count = 0
    for x in loader:
        print(count)
        count+=1

    print(count)

Output:

data
HeteroData(
  movie={ x=[9742, 404] },
  user={
    num_nodes=610,
    x=[610, 610]
  },
  (user, rates, movie)={
    edge_index=[2, 100836],
    edge_label=[100836]
  },
  (movie, rev_rates, user)={
    edge_index=[2, 100836],
    edge_label=[100836]
  }
)

HeteroData(
  movie={ x=[9742, 404] },
  user={
    num_nodes=610,
    x=[610, 610]
  },
  (user, rates, movie)={
    edge_index=[2, 80670],
    edge_label=[80670],
    edge_label_index=[2, 80670]
  },
  (movie, rev_rates, user)={
    edge_index=[2, 80670],
    edge_label=[80670]
  }
)

HeteroData(
  movie={ x=[9742, 404] },
  user={
    num_nodes=610,
    x=[610, 610]
  },
  (user, rates, movie)={
    edge_index=[2, 80670],
    edge_label=[10083],
    edge_label_index=[2, 10083]
  },
  (movie, rev_rates, user)={
    edge_index=[2, 80670],
    edge_label=[80670]
  }
)
Segmentation fault

Environment

denadai2 commented 1 year ago

This does not happen on my Mac.

denadai2 commented 1 year ago

The problem disappears when I use `d[tuple(['user', 'rates', 'movie'])].edge_label_index.cpu()`.

rusty1s commented 1 year ago

This should be fixed in master. Previously, LinkNeighborLoader indeed expected CPU-based edge_label_index. Feel free to re-open if the issue is not yet resolved.
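For anyone still on a release without that fix, the workaround reported above can be sketched roughly as follows. The helper names `cpu_seed_edges` and `make_link_loader` are hypothetical and not part of PyG. The sketch assumes the only change needed is moving `edge_label_index` to the CPU before handing it to `LinkNeighborLoader`, as described earlier in this thread.

```python
def cpu_seed_edges(split, edge_type):
    """Return (edge_type, edge_label_index) with the index tensor on the CPU.

    Hypothetical helper: `split` is one of the HeteroData objects returned by
    RandomLinkSplit, `edge_type` is a tuple such as ('user', 'rates', 'movie').
    """
    # Tensor.cpu() returns the tensor itself when it already lives on the CPU.
    return edge_type, split[edge_type].edge_label_index.cpu()


def make_link_loader(split, edge_type, num_neighbors=(10, 10), batch_size=1):
    """Build a LinkNeighborLoader whose seed edges are guaranteed to be on CPU."""
    # Imported lazily so that cpu_seed_edges() stays usable without PyG.
    from torch_geometric.loader import LinkNeighborLoader

    return LinkNeighborLoader(
        split,
        num_neighbors=list(num_neighbors),
        batch_size=batch_size,
        edge_label_index=cpu_seed_edges(split, edge_type),
    )
```

With the script from the report, this would be called as `make_link_loader(d, ('user', 'rates', 'movie'))` inside the `for d in [train_data, val_data]` loop. On versions that include the fix the extra `.cpu()` call should be harmless.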

riyajatar37003 commented 2 months ago

`from torch_geometric.data import Data` throws the following error:

/tmp/.local/lib/python3.10/site-packages/torch_geometric/typing.py:54: UserWarning: An issue occurred while importing 'pyg-lib'. Disabling its usage. Stacktrace: libcudart.so.11.0: cannot open shared object file: No such file or directory
  warnings.warn(f"An issue occurred while importing 'pyg-lib'. "
Segmentation fault (core dumped)
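The warning says that `libcudart.so.11.0` could not be loaded, which suggests the installed `pyg-lib` build expects a CUDA 11 runtime that is not available on this machine. A rough way to collect the relevant version information without importing the packages (the import itself is what crashes) might look like the sketch below. The function name `installed_versions` is hypothetical, and the distribution names passed to it are assumptions about how the packages were installed.

```python
import ctypes.util
from importlib import metadata


def installed_versions(names=("torch", "torch_geometric", "pyg_lib")):
    """Map each distribution name to its installed version, or None if absent.

    Only package metadata is read; none of the packages are imported.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    # Rough check only: asks the system's library lookup for a CUDA runtime.
    print("cudart found:", ctypes.util.find_library("cudart"))
```

Version strings of CUDA-enabled wheels often carry a suffix naming the CUDA release they were built for, so comparing the suffixes of `torch` and `pyg_lib` is a reasonable first step before reinstalling either one.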