Open LA11131110128 opened 1 year ago
Hi @LA11131110128,
It looks like the error comes from a mismatch between the GNN and your customized data, though I cannot tell exactly where. I would suggest checking the defined GIN architecture (input_node_dimension, etc.) and confirming that it matches your data.
Also, maybe print out the shapes of x and edge_index to see whether the maximum edge index reaches or exceeds the number of nodes (valid indices run from 0 to the number of nodes minus 1).
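The two checks suggested above can be sketched as a small helper. This is only an illustration, not part of GraphCL or PyTorch Geometric: the names `check_graph` and `expected_in_dim` are hypothetical, and the function takes plain Python values so it can run on the CPU without triggering the CUDA assert. For a PyTorch Geometric graph `data`, one would presumably pass `data.x.size(0)`, `data.x.size(1)` and `data.edge_index.tolist()`, with `expected_in_dim` set to the input dimension the GIN encoder was built with (the log reports `num_features: 1`).

```python
def check_graph(num_nodes, num_node_features, edge_index, expected_in_dim):
    """Return a list of human-readable problems found in one graph.

    edge_index is a pair of equally long integer sequences
    (source nodes, target nodes), as in the usual 2 x E layout.
    """
    problems = []

    # The feature width of x must match the encoder's input dimension.
    if num_node_features != expected_in_dim:
        problems.append(
            f"x has {num_node_features} features per node, "
            f"model expects {expected_in_dim}"
        )

    src, dst = edge_index
    if len(src) != len(dst):
        problems.append(f"edge_index rows differ in length: {len(src)} vs {len(dst)}")

    # Every endpoint must be a valid row of x: 0 <= index < num_nodes.
    endpoints = list(src) + list(dst)
    if endpoints:
        lowest, highest = min(endpoints), max(endpoints)
        if lowest < 0:
            problems.append(f"negative node index {lowest}")
        if highest >= num_nodes:
            problems.append(
                f"max node index {highest} is out of range for {num_nodes} nodes"
            )
    return problems


# A valid 3-node cycle, then the same graph with one endpoint out of range.
good = check_graph(3, 1, [[0, 1, 2], [1, 2, 0]], expected_in_dim=1)
bad = check_graph(3, 1, [[0, 1, 3], [1, 2, 0]], expected_in_dim=1)
print(good)
print(bad)
```

Running this over every graph in the dataset before moving anything to the GPU should point at the offending graph directly, which is easier to read than the device-side assert.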
I tried my own dataset, whose number of edges (num_of_edge) exceeds 10^6, on your unsupervised learning framework. When I load the data, I get an assertion error:
loading GCC 7.3.1 based on SCL Developer Toolset 7
loading CUDA 10.1 with cuDNN / NCCL based on cntr cuda:10.1-cudnn7-devel-centos7
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [6,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [7,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [51,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [88,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [89,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [90,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Processing...
Done!
5264
1
lr: 0.01
num_features: 1
hidden_dim: 32
num_gc_layers: 4
dataset_num_classes: 7
Traceback (most recent call last):
  File "gsimclr.py", line 189, in <module>
    emb, y = model.encoder.get_embeddings(dataloader_eval)
  File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 83, in get_embeddings
    x, _ = self.forward(x, edge_index, batch)
  File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 56, in forward
    x = F.relu(self.convs[i](x, edge_index))
  File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch_geometric/nn/conv/gin_conv.py", line 67, in forward
    out += (1 + self.eps) * x_r
RuntimeError: CUDA error: device-side assert triggered
I am wondering whether the learning framework has a limit on data size, and I would appreciate any suggestions for solving this problem. Thank you!