Hello,
Your work is excellent. I ran into an issue while trying to reproduce it: I cannot get it to run on CUDA, even though I can confirm that CUDA works correctly in my PyTorch setup.

When I print `self.actor_critic_net.actor_net.shared_net.node_encoder.weight.device` in the script, it shows `device(type='cuda', index=0)`. However, I found that the data returned by `self.batch_data` in the `forward` function of `SGNNStateEncoder` is on the CPU. I tried moving this data to the same device as `self.node_encoder.weight.device`, but it stayed on the CPU. When I printed `self.node_encoder.weight.device` again, it was now on the CPU as well, which contradicts what I printed earlier.

Could you please check whether some part of the model training process moves the model back from CUDA to the CPU?
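For reference, here is a minimal sketch of the move I attempted (the variable names are illustrative, standing in for the tensors returned by `self.batch_data` and the `node_encoder` layer in `SGNNStateEncoder`):

```python
import torch

# Illustrative stand-in for the encoder layer whose weight reports cuda:0.
encoder = torch.nn.Linear(4, 8)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
encoder.to(device)

# Stand-in for the batch tensors returned by self.batch_data, which arrive on CPU.
batch = torch.randn(2, 4)

# The move I attempted: put the batch on the same device as the weight.
batch = batch.to(encoder.weight.device)
print(batch.device, encoder.weight.device)
```

In my run, after this `.to(...)` call the batch was still on the CPU, and the weight itself then also reported `cpu`, which is what makes me suspect the model is being moved back to the CPU somewhere during training.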
Thank you for your assistance.