WallofWonder opened this issue 1 year ago
I'm really sorry, but I currently don't know the reason behind this. If you want to train the network on multiple GPUs, you can add `model = torch.nn.DataParallel(model)` followed by `model.cuda()` in main.py.
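Spelled out, that suggestion would look roughly like the sketch below (the `nn.Linear` stand-in and the device-count guard are assumptions for illustration, not the repo's actual code in main.py):

```python
import torch
import torch.nn as nn

# Stand-in model; in the repo this would be the network main.py builds.
model = nn.Linear(10, 2)

if torch.cuda.device_count() > 1:
    # DataParallel replicates the module onto every visible GPU and
    # scatters each input batch along dim 0 across the replicas.
    model = nn.DataParallel(model)

model.cuda()
```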
Anyway, thank you for your prompt reply and great work.😀 I'm trying to figure it out.
I added some code in `train_step()` to monitor the gradients of the parameters, and found that the parameters of `gcn1.conv`, `gcn2.conv`, `mlp1`, and `mlp2` all have a gradient of `None`.
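What I added was essentially a grad check like this sketch (not my exact code; the toy `nn.Linear` and dummy loss are stand-ins so it runs on its own, while in practice `model` and `loss` come from the training loop):

```python
import torch
import torch.nn as nn

# Toy stand-ins so the check is runnable on its own.
model = nn.Linear(4, 1)
loss = model(torch.randn(8, 4)).sum()

loss.backward()
for name, param in model.named_parameters():
    if param.grad is None:
        # A grad of None after backward() means the parameter never
        # entered the graph that produced the loss.
        print(f"{name}: grad is None")
    else:
        print(f"{name}: grad norm = {param.grad.norm().item():.6f}")
```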
Here is part of the console output: [screenshot omitted]

I added the same check in CLNet, and it doesn't have this issue. I don't know whether this affects performance. Moreover, when I tried to train the model with multiple GPUs, this issue became an obstacle for me.