kexinhuang12345 / DeepPurpose

A Deep Learning Toolkit for DTI, Drug Property, PPI, DDI, Protein Function Prediction (Bioinformatics)
https://doi.org/10.1093/bioinformatics/btaa1005
BSD 3-Clause "New" or "Revised" License
962 stars, 272 forks

Error when using DGL_GCN #142

Closed (hyojin0912 closed 2 years ago)

hyojin0912 commented 2 years ago

Hello, I got the errors below when building a DTI prediction model with the DGL-based GCN encoder (DGL_GCN) in DeepPurpose.

Running Code

from DeepPurpose import utils, dataset
from DeepPurpose import DTI as models
from DeepPurpose.utils import *
from DeepPurpose.dataset import *
import torch  # used explicitly below for device selection
import warnings
warnings.filterwarnings("ignore")

drug_encoding = 'DGL_GCN'
target_encoding = 'Transformer'
X_drugs, X_targets, y = dataset.read_file_training_dataset_drug_target_pairs('./toy_data/DeepPurpose_Inhouse.txt')
y = y.astype(int)

train, val, test = data_process(X_drugs, X_targets, y, 
                                drug_encoding, target_encoding, 
                                split_method='random',frac=[0.4,0.3,0.3])

config = generate_config(drug_encoding = drug_encoding, 
                         target_encoding = target_encoding, 
                         train_epoch = 1, 
                         LR = 0.001, 
                         batch_size = 128,
                         result_folder = "./BI/GCN_Transformer_Inhouse/"
                        )

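GPU_NUM = 0  # assumption: the GPU index was not shown in the original post
device = torch.device(f'cuda:{GPU_NUM}' if torch.cuda.is_available() else 'cpu')
if device.type == 'cuda':
    torch.cuda.set_device(device)  # make this the current GPU; skipped on CPU-only machines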
model = models.model_initialize(**config)
model.train(train,val, test)

Error Message

Drug Target Interaction Prediction Mode...
in total: 34308 drug-target pairs
encoding drug...
unique drugs: 92
encoding protein...
unique target sequence: 381
splitting dataset...
Done.
Using backend: pytorch
Let's use 8 GPUs!
--- Data Preparation ---
--- Go for Training ---
Traceback (most recent call last):
  File "GCN_test.py", line 29, in <module>
    model.train(train,val, test)
  File "/mnt/hyojin0912/DeepPurpose-master/DeepPurpose/DTI.py", line 436, in train
    score = self.model(v_d, v_p)
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/mnt/hyojin0912/anaconda3/envs/DeepPurpose/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/hyojin0912/DeepPurpose-master/DeepPurpose/DTI.py", line 52, in forward
    v_f = torch.cat((v_D, v_P), 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 128 but got size 2 for tensor number 1 in the list.

kexinhuang12345 commented 2 years ago

Sorry for the late reply. It seems there are bugs in multi-GPU training for DGL-based encoders. Could you try a single GPU for now, since your dataset looks small?
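
One way to try the single-GPU suggestion, as a minimal sketch: hide all but one GPU from the process by setting the `CUDA_VISIBLE_DEVICES` environment variable before `torch` or DeepPurpose is imported. The "Let's use 8 GPUs!" line and the `data_parallel.py` frames in the traceback suggest the model is wrapped for multi-GPU use when several devices are visible, so exposing only one should avoid that path. The index `"0"` is an assumption; any single GPU index should work.

```python
import os

# Assumption: physical GPU 0 is the one to use; substitute any single index.
# This must run before `import torch` or any DeepPurpose import, because
# CUDA reads the variable when it is first initialized in the process.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
print(f"GPUs visible to this process: {visible}")

# The original script would follow from here, unchanged:
# from DeepPurpose import utils, dataset
# from DeepPurpose import DTI as models
# ...
```

Note that the one visible device is renumbered as `cuda:0` inside the process, so any explicit device index in the script (such as `GPU_NUM` above) would need to be 0. The same effect can be had without editing the script by setting the variable in the shell when launching it.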