pyg-team / pytorch_geometric

Graph Neural Network Library for PyTorch
https://pyg.org
MIT License
21.16k stars 3.64k forks

Error in `scatter_sum` with pre-transform that adds a node to the graph #2083

Closed LindaSt closed 3 years ago

LindaSt commented 3 years ago

❓ Questions & Help

Hi!

I get an error in scatter_sum that I really don't understand; it only happens with a new transform that I wrote.

ERROR:

RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/Users/linda/miniconda2/envs/GraphNeuralNetworkExperiments/lib/python3.8/site-packages/torch_scatter/scatter.py", line 12, in scatter_sum
                out: Optional[torch.Tensor] = None,
                dim_size: Optional[int] = None) -> torch.Tensor:
    index = broadcast(index, src, dim)
            ~~~~~~~~~ <--- HERE
    if out is None:
        size = list(src.size())
  File "/Users/linda/miniconda2/envs/GraphNeuralNetworkExperiments/lib/python3.8/site-packages/torch_scatter/utils.py", line 13, in broadcast
    for _ in range(src.dim(), other.dim()):
        src = src.unsqueeze(-1)
    src = src.expand_as(other)
          ~~~~~~~~~~~~~ <--- HERE
    return src
RuntimeError: The expanded size of the tensor (212) must match the existing size (196) at non-singleton dimension 0.  Target sizes: [212, 128].  Tensor sizes: [196, 1]

PRE-TRANSFORM:

class CentralNode(object):
    def __call__(self, data):
        # add the central node
        central_node_features = torch.mean(data.x, 0)
        data.x = torch.cat((data.x, central_node_features.unsqueeze(0)), 0)

        # add edges to all other nodes
        additional_edges = torch.tensor([list(range(data.num_nodes)), [data.num_nodes]*data.num_nodes])
        data.edge_index = torch.cat((data.edge_index, additional_edges), 1)

        # give it a position
        if data.pos is not None:
            central_node_pos = torch.mean(data.pos, 0)
            data.pos = torch.cat((data.pos, central_node_pos.unsqueeze(0)), 0)

        return data

    def __repr__(self):
        return '{}()'.format(self.__class__.__name__)

Without the pre-transform, everything works fine. Any help is appreciated :)!

ldv1 commented 3 years ago

You changed the size of the features of the nodes; this is given by dataset.num_features. If your code uses this information, then this is likely to be the reason for the error.

ldv1 commented 3 years ago

My answer is wrong if transform adjusts dataset.num_features automatically. I do not know.

rusty1s commented 3 years ago

Can you show a complete example? I don't think there is anything wrong with the transform, so I guess this is caused by some hard-coded value in your model.

LindaSt commented 3 years ago

Btw, this was run with the following versions, running locally on CPU.

torch-cluster             1.5.8                    pypi_0    pypi
torch-geometric           1.6.3                    pypi_0    pypi
torch-scatter             2.0.5                    pypi_0    pypi
torch-sparse              0.6.8                    pypi_0    pypi
torch-spline-conv         1.2.0                    pypi_0    pypi

When I run it on the GPU cluster (same versions of the packages above, but possibly different versions of pytorch-lightning and other packages), I get a different error that I think is related:

Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [4,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [5,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [6,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [7,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [8,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [9,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [10,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [11,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [12,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [13,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [14,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [15,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [16,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [17,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [18,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [19,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [20,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [21,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [24,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [25,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [26,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [27,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [28,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [29,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [30,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [31,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [32,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [33,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [34,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [35,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [36,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [37,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [38,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [39,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [40,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [41,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [42,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [43,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [44,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [45,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [46,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [47,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [48,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [49,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [50,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [51,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [52,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [53,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [54,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [56,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [57,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [58,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [59,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [60,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [61,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [62,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1607370117127/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:115: operator(): block: [97,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "/HOME/studerl/GraphNeuralNetworkExperiments/project/dualgraph_experiments_sythetic_handwriting.py", line 58, in <module>
    DualGraphExperimentsSynthetic(**kwargs).main()
  File "/HOME/studerl/GraphNeuralNetworkExperiments/project/experiments_template.py", line 124, in main
    self.one_run()
  File "/HOME/studerl/GraphNeuralNetworkExperiments/project/experiments_template.py", line 110, in one_run
    trainer.fit(self.get_model(run_id=run_id), train_loader, val_loader)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 510, in fit
    results = self.accelerator_backend.train()
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 57, in train
    return self.train_or_test()
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 74, in train_or_test
    results = self.trainer.train()
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 532, in train
    self.run_sanity_check(self.get_model())
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 730, in run_sanity_check
    _, eval_results = self.run_evaluation(max_batches=self.num_sanity_val_batches)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 646, in run_evaluation
    output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 180, in evaluation_step
    output = self.trainer.accelerator_backend.validation_step(args)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 73, in validation_step
    return self._step(self.trainer.model.validation_step, args)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/pytorch_lightning/accelerators/gpu_accelerator.py", line 65, in _step
    output = model_step(*args)
  File "/HOME/studerl/GraphNeuralNetworkExperiments/runners/graph_classification.py", line 99, in validation_step
    loss, y_hat = self.forward(batch)
  File "/HOME/studerl/GraphNeuralNetworkExperiments/runners/graph_classification.py", line 80, in forward
    y_hat = self.model(data, batch_size=target.shape[0])
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/HOME/studerl/GraphNeuralNetworkExperiments/model_modules/graph_classifiers/dual-graph-experiment-architectures.py", line 211, in forward
    x_graph = F.relu(self.conv1(x_graph, graph_edge_index))
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch_geometric/nn/conv/sage_conv.py", line 63, in forward
    out = self.lin_l(out)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "/HOME/studerl/.conda/envs/gnn/lib/python3.8/site-packages/torch/nn/functional.py", line 1690, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: CUDA error: device-side assert triggered

Both times the same code, and this is the architecture:

import torch
import torch.nn.functional as F
from torch.nn import Linear
from torch_geometric.nn import global_add_pool, GINConv, GCNConv, SAGEConv

def get_message_passing_fct(message_passing_fct_name, input_size, output_size):
    if message_passing_fct_name == "GINConv":
        return GINConv(torch.nn.Sequential(Linear(input_size, output_size), torch.nn.ReLU(), Linear(output_size, output_size)))
    if message_passing_fct_name == "GCNConv":
        return GCNConv(input_size, output_size)
    if message_passing_fct_name == "SAGEConv":
        return SAGEConv(input_size, output_size)

class SingleGraph(torch.nn.Module):
    def __init__(self, message_passing_fct_name, num_features, output_channels, num_layers=3, nb_neurons=128, ablate=False, **kwargs):
        super(SingleGraph, self).__init__()
        self.hparam = {'message_passing_fct_name': message_passing_fct_name,
                       'num_features': num_features,
                       'output_channels': output_channels,
                       'num_layers': num_layers,
                       'nb_neurons': nb_neurons,
                       'model_name': self.__repr__()}

        # set up the GNN for the graph
        self.conv1 = get_message_passing_fct(message_passing_fct_name, num_features, nb_neurons)
        self.convs = torch.nn.ModuleList()
        for i in range(num_layers - 1):
            self.convs.append(get_message_passing_fct(message_passing_fct_name, nb_neurons, nb_neurons))

        self.lin_1 = torch.nn.Linear(nb_neurons, nb_neurons)

        # final classification layer
        self.lin_2 = Linear(nb_neurons, output_channels)

    def forward(self, data, batch_size, **kwargs):
        # prepare the data
        x_graph, graph_edge_index, batch = data.x, data.edge_index, data.batch
        # graph
        x_graph = F.relu(self.conv1(x_graph, graph_edge_index))
        for conv in self.convs:
            x_graph = F.relu(conv(x_graph, graph_edge_index))
        x_graph = global_add_pool(x_graph, batch, size=batch_size)
        x_graph = F.relu(self.lin_1(x_graph))
        x_graph = F.dropout(x_graph, p=0.5, training=self.training)
        x_output = self.lin_2(x_graph)

        return x_output

    def __repr__(self):
        return self.__class__.__name__

SingleGraph('SAGEConv', num_features, output_channels, **kwargs)

rusty1s commented 3 years ago

It looks like this may be a problem with your data. Please ensure that edge_index.max() is lower than x.size(0).

LindaSt commented 3 years ago

I've checked, it is. I'm now testing with the AIDS dataset from TUDataset, and the issue persists. I think the problem is that the length of batch does not match x.size(0), but I am not sure how my transform is causing that.

rusty1s commented 3 years ago

You can try explicitly setting data.num_nodes in your transform:

data.num_nodes = data.x.size(0)

LindaSt commented 3 years ago

That seems to have done the trick :)! Vielen Dank!
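
The sizes in the first traceback (212 rows in x against an index of length 196) differ by 16, which would fit a mini-batch of 16 graphs that each gained one central node while the per-graph node count kept its old value. The following is a simplified pure-Python model of that situation, not PyG's actual batching code. The function name `build_batch_vector` and the individual graph sizes are made up for illustration, and the batch size of 16 is inferred from the error message rather than stated in the thread.

```python
from itertools import chain


def build_batch_vector(node_counts):
    """Assign every node the index of the graph it belongs to."""
    return list(chain.from_iterable([graph] * n for graph, n in enumerate(node_counts)))


# Hypothetical mini-batch: 16 graphs whose original sizes sum to 196 nodes.
cached_counts = [12] * 12 + [13] * 4
# The transform appends one central node per graph, so x gains one row per graph.
actual_rows = [n + 1 for n in cached_counts]

stale_batch = build_batch_vector(cached_counts)  # built from the outdated counts
fixed_batch = build_batch_vector(actual_rows)    # built after refreshing the counts

print(len(stale_batch), sum(actual_rows))  # 196 212 -> lengths disagree
print(len(fixed_batch), sum(actual_rows))  # 212 212 -> lengths agree
```

In the transform itself, the suggestion above amounts to refreshing the count after data.x has been extended. It also seems safer to read data.num_nodes into a local variable before concatenating, so that the new edges do not depend on whether the count is cached or inferred from x.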

pinkfloyd06 commented 3 years ago

Hi @rusty1s & @LindaSt ,

I get a similar error, but only after two epochs of training. I train my network with gcn_conv, and the error comes from scatter_add (torch_scatter/scatter.py, line 22), as follows:

 0%|▎                                                                                    | 2/100 [00:00<01:47,  4.65it/s/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [48,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [32,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [33,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [37,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [41,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [44,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [829,0,0], thread: [45,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [11,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [15,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [19,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1595629403081/work/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:86: operator(): block: [830,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
  0%|▎                                                                                    | 2/100 [00:00<02:18,  3.61it/s]
Traceback (most recent call last):

  File "/home/anaconda3/envs/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/main.py", line 165, in forward
    edge_index,
  File "/home/anaconda3/envs/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/anaconda3/envs/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py", line 161, in forward
    self.improved, self.add_self_loops, dtype=x.dtype)
  File "/home/anaconda3/envs/lib/python3.7/site-packages/torch_geometric/nn/conv/gcn_conv.py", line 62, in gcn_norm
    deg = scatter_add(edge_weight, col, dim=0, dim_size=num_nodes)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "/home/anaconda3/envs/lib/python3.7/site-packages/torch_scatter/scatter.py", line 22, in scatter_add
            size[dim] = int(index.max()) + 1
        out = torch.zeros(size, dtype=src.dtype, device=src.device)
        return out.scatter_add_(dim, index, src)
               ~~~~~~~~~~~~~~~~ <--- HERE
    else:
        return out.scatter_add_(dim, index, src)
RuntimeError: CUDA error: device-side assert triggered

Here is how I build my data:

train_folder = []

for data_sample in data_list:
    features, coordinates, edge_index, edge_weight, targets = data_sample

    train_folder.append(
        torch_geometric.data.Data(
            x=features,
            pos=coordinates,
            edge_index=edge_index,
            edge_attr=edge_weight,
            y=targets,
        )
    )

train_loader = torch_geometric.data.DataLoader(train_folder, batch_size=batch_size_train, shuffle=True)

Any clue?

Thank you

rusty1s commented 3 years ago

Can you also confirm that edge_index.max() is lower than x.size(0)?

for data in train_folder:
    assert data.edge_index.max() < data.num_nodes
    assert data.edge_index.max() < data.x.size(0)

pinkfloyd06 commented 3 years ago

Problem solved, thank you! It was due to a NaN value in the adjacency matrix.
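
For anyone who hits the same assertion: CUDA device-side asserts are reported asynchronously, so the Python traceback can end at an unrelated call (in the GPU trace earlier in this thread it ends in a linear layer). Checking the data on the CPU before training may therefore save time. Below is a minimal sketch of such a check in plain Python. The function `validate_graph` is hypothetical and not part of PyG; it assumes the tensors have been converted to lists, for example with `edge_index.tolist()`.

```python
import math


def validate_graph(edge_index, num_nodes, edge_weight=None):
    """Return a list of problems found; an empty list means all checks passed.

    edge_index:  [sources, targets], two lists of integer node indices
    num_nodes:   number of rows in the node feature matrix
    edge_weight: optional list with one float per edge
    """
    problems = []
    sources, targets = edge_index
    if len(sources) != len(targets):
        problems.append("source and target lists differ in length")

    # Every index must address an existing row of the feature matrix.
    for name, row in (("source", sources), ("target", targets)):
        for edge_id, node in enumerate(row):
            if not 0 <= node < num_nodes:
                problems.append(f"{name} index {node} of edge {edge_id} is out of range")

    # NaN or infinite weights break the degree normalisation further down.
    if edge_weight is not None:
        if len(edge_weight) != len(sources):
            problems.append("edge_weight length does not match the number of edges")
        for edge_id, weight in enumerate(edge_weight):
            if not math.isfinite(weight):
                problems.append(f"weight of edge {edge_id} is not finite")

    return problems


print(validate_graph([[0, 1], [1, 2]], num_nodes=3))
print(validate_graph([[0, 1], [1, 3]], num_nodes=3, edge_weight=[0.5, float("nan")]))
```

The first call prints an empty list; the second reports the out-of-range target index and the non-finite weight.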

rmunia commented 2 years ago

> Can you also confirm that edge_index.max() is lower than x.size(0)?
>
> for data in train_folder:
>     assert data.edge_index.max() < data.num_nodes
>     assert data.edge_index.max() < data.x.size(0)

@rusty1s Hi, my x.size(0) is 9 and my edge_index.max() is tensor(6742). The value of edge_index.max() also varies: it is different for every file. How can I fix that?

rusty1s commented 2 years ago

Can you clarify? The assertion ensures that node features exist for every index present in your edge_index. If your edge_index contains a value of 6742, that means you have at least 6743 nodes in your graph (indices start at 0), so x.size(0) should be at least 6743 as well.
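
If the large values in edge_index are raw identifiers rather than row positions in x (an assumption; the thread does not say how the files are built), one option is to relabel them to contiguous indices before building the graph. The sketch below uses a hypothetical helper `relabel_edges` and made-up identifiers. The rows of x would then have to follow the order given by the returned mapping.

```python
def relabel_edges(edges):
    """Map arbitrary node identifiers to contiguous indices 0..n-1.

    edges: iterable of (source_id, target_id) pairs
    Returns ([sources, targets], id_to_index).
    """
    id_to_index = {}
    sources, targets = [], []
    for src, dst in edges:
        for node_id in (src, dst):
            # A new identifier gets the next free index; known ones keep theirs.
            id_to_index.setdefault(node_id, len(id_to_index))
        sources.append(id_to_index[src])
        targets.append(id_to_index[dst])
    return [sources, targets], id_to_index


raw_edges = [(6742, 15), (15, 903), (903, 6742)]  # hypothetical raw identifiers
edge_index, id_to_index = relabel_edges(raw_edges)

# With zero-based indices, the feature matrix needs max index + 1 rows.
min_nodes = max(max(edge_index[0]), max(edge_index[1])) + 1
print(edge_index, min_nodes)  # [[0, 1, 2], [1, 2, 0]] 3
```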

rmunia commented 2 years ago

> Can you clarify? The assertion ensures that node features exist for every index present in your edge_index. If your edge_index contains a value of 6742, that means you have at least 6743 nodes in your graph (indices start at 0), so x.size(0) should be at least 6743 as well.

Thank you for your explanation. With this, I was able to solve the issue.