Shen-Lab / GraphCL

[NeurIPS 2020] "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, Yang Shen
MIT License

Unsupervised learning with self created dataset #58

Open LA11131110128 opened 1 year ago

LA11131110128 commented 1 year ago

I tried my own dataset with your unsupervised learning framework; in it, num_of_edge exceeds 10^6. When I load the data, I get an assertion error.


loading GCC 7.3.1 based on SCL Developer Toolset 7


loading CUDA 10.1 with cuDNN / NCCL based on cntr cuda:10.1-cudnn7-devel-centos7

/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [6,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [7,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [21,0,0], thread: [51,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [88,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [89,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 1, SrcDim = 1, IdxDim = -2, IndexIsMajor = true]: block: [20,0,0], thread: [90,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Processing... Done! 5264 1

lr: 0.01 num_features: 1 hidden_dim: 32 num_gc_layers: 4

dataset_num_classes: 7
Traceback (most recent call last):
  File "gsimclr.py", line 189, in <module>
    emb, y = model.encoder.get_embeddings(dataloader_eval)
  File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 83, in get_embeddings
    x, _ = self.forward(x, edge_index, batch)
  File "/home/u8411596/GraphCL-master/unsupervised_TU/gin.py", line 56, in forward
    x = F.relu(self.convs[i](x, edge_index))
  File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/u8411596/.conda/envs/py36/lib/python3.6/site-packages/torch_geometric/nn/conv/gin_conv.py", line 67, in forward
    out += (1 + self.eps) * x_r
RuntimeError: CUDA error: device-side assert triggered

I am wondering whether the framework has a limit on the size of the data, and I would appreciate any suggestions for solving this problem. Thank you!

yyou1996 commented 1 year ago

Hi @LA11131110128,

It looks like the error comes from a mismatch between the GNN and your custom data (though I am not sure exactly where it is). I would suggest checking the GIN architecture as defined (input_node_dimension, etc.) and confirming that it matches your data.

It may also help to print the shapes of x and edge_index to see whether the largest index in edge_index reaches or exceeds the number of nodes (valid indices run from 0 to num_nodes - 1).
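The edge-index check suggested here could be sketched roughly as follows. This is only an illustration and not code from the GraphCL repository: `find_bad_edges` is a hypothetical helper name, and it takes plain Python lists so that it runs without a GPU or PyTorch installed.

```python
def find_bad_edges(num_nodes, edge_index):
    """Return (position, src, dst) for every edge that references a missing node.

    edge_index is a pair of equal-length integer sequences [sources, targets],
    the same layout a PyTorch Geometric edge_index has after .tolist().
    Hypothetical helper for illustration; not part of GraphCL.
    """
    sources, targets = edge_index
    if len(sources) != len(targets):
        raise ValueError("edge_index rows must have the same length")
    bad = []
    for pos, (src, dst) in enumerate(zip(sources, targets)):
        # Valid node ids run from 0 to num_nodes - 1 inclusive.
        if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
            bad.append((pos, src, dst))
    return bad


# Toy graph with 3 nodes (valid ids 0, 1, 2); the last edge points at node 3.
edges = [[0, 1, 2], [1, 2, 3]]
print(find_bad_edges(3, edges))  # prints [(2, 2, 3)]
```

Assuming each item of your dataset is a PyTorch Geometric `Data` object with node features in `data.x`, you could call `find_bad_edges(data.x.size(0), data.edge_index.tolist())` on every graph before training. A non-empty result would be consistent with the `srcIndex < srcSelectDimSize` assertion in the log. One possible cause in hand-built datasets is node ids that are not renumbered to start from 0 within each graph, but that is a guess about your data and not something the log confirms.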