isayev / ReLeaSE

Deep Reinforcement Learning for de-novo Drug Design
MIT License

Weights are not updating #39

Open vmarar opened 3 years ago

vmarar commented 3 years ago

Hi! My weights don't seem to be changing.

I printed out `grad` for each of the layers and it comes back as `None`; the grad of the hidden layer is `None` as well.

In the code, the only thing I changed in the reinforcement file was `rl_loss -= (log_probs[0, top_i].cpu().detach().numpy() * discounted_reward)`, and then I tried `rl_loss -= (log_probs[0, top_i].item() * discounted_reward)`, due to an issue with cuda device = 0.

Would you know any reason why `grad` would be coming up as `None` for all of the layers? I believe this is why `optimizer.step()` is not working and the weights are not updating.

Mariewelt commented 3 years ago

Hi @vmarar I'd say this is expected behavior. Further in the code, `rl_loss` is backpropagated by applying `rl_loss.backward()`. But your changes detach `rl_loss` from the computational graph and there is nothing to backpropagate. Is there any specific reason why you need to do this? What kind of issues/errors do you have with your cuda device?
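For illustration, here is a minimal sketch (a toy linear layer standing in for the generator; names and values are illustrative, not the actual ReLeaSE code) of why the detached variants cannot propagate gradients, while the attached tensor can:

```python
import torch

# Toy stand-in for the generator's output layer (illustrative only).
layer = torch.nn.Linear(4, 3)
log_probs = torch.log_softmax(layer(torch.randn(1, 4)), dim=1)
top_i = 1
discounted_reward = 0.9  # plain Python float

# Detached variant (what .cpu().detach().numpy() or .item() produce):
detached_loss = -(log_probs[0, top_i].item() * discounted_reward)
print(type(detached_loss))            # <class 'float'> -- no graph, nothing to backpropagate

# Attached variant: log_probs stays a tensor, so the graph is preserved.
rl_loss = -(log_probs[0, top_i] * discounted_reward)
print(rl_loss.grad_fn is not None)    # True
rl_loss.backward()
print(layer.weight.grad is not None)  # True -- gradients reach the layer
```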

vmarar commented 3 years ago

Would this be the case using `rl_loss -= (log_probs[0, top_i].item() * discounted_reward)` as well? Would there still be a detachment? The error is: `can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first`

I currently have CUDA visible devices set to 0. Is there a way to bypass this error?

Thank you!

Mariewelt commented 3 years ago

Would this be the case using `rl_loss -= (log_probs[0, top_i].item() * discounted_reward)` as well? Would there still be a detachment?

Yes, this is exactly what .item() does. It returns just the value of a tensor.
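For example (generic PyTorch behavior, not specific to ReLeaSE):

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
y = (x * 3).sum()
print(y)         # tensor(6., grad_fn=<SumBackward0>) -- still attached to the graph
print(y.item())  # 6.0 -- a plain Python float, detached from the graph
```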

The error is: `can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first`

Which line triggers this error? Could you please provide a traceback?

vmarar commented 3 years ago

[Screenshot of the traceback, 2021-01-06 10-45-15]

This is the traceback; it comes directly from the `rl_loss` line.

Our predictive model returns a NumPy array, and `discounted_reward` is also a NumPy value rather than a tensor; I think that is the issue. I'll try converting the reward from NumPy to a tensor. Would that still lead to a disconnect in the graph?

vmarar commented 3 years ago

I tried converting from NumPy to a tensor and then moving it to `cuda:0`. There is no error, but the model weights are still not changing.

Mariewelt commented 3 years ago

I tried converting from NumPy to a tensor and then moving it to `cuda:0`. There is no error, but the model weights are still not changing.

Yes, that wouldn't help, because you are first detaching the loss and then creating a new node that is not connected to the existing computational graph. In this scenario, the loss is not backpropagated and that's why weights are not changing.
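For reference, a self-contained sketch of the pattern being described (toy policy, vocabulary size, and constant reward chosen here for illustration, not the actual ReLeaSE generator or predictive model): the reward side can be a plain float taken from a NumPy value, but the `log_probs` term has to remain an attached tensor so that `rl_loss` keeps its `grad_fn`.

```python
import torch

# Toy REINFORCE-style update; layer sizes and the constant reward are stand-ins.
policy = torch.nn.Linear(8, 5)                 # 5 "actions" / vocabulary tokens
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

n_batch = 4
rl_loss = 0.0
for _ in range(n_batch):
    log_probs = torch.log_softmax(policy(torch.randn(1, 8)), dim=1)
    top_i = int(torch.multinomial(log_probs.exp(), 1))
    discounted_reward = 0.9                    # e.g. a value coming from a NumPy score
    # log_probs[0, top_i] stays a tensor and the reward is an ordinary float,
    # so rl_loss keeps a grad_fn and backward() can reach the policy weights.
    rl_loss = rl_loss - log_probs[0, top_i] * float(discounted_reward)

rl_loss = rl_loss / n_batch
optimizer.zero_grad()
rl_loss.backward()
optimizer.step()
print(policy.weight.grad is not None)          # True -- gradients were populated
```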

Let me investigate the issue. Could you please provide your environment details? The versions of PyTorch, NVIDIA driver, and what GPU you are using.

vmarar commented 3 years ago

Here are the env details:

GPU: GeForce RTX 2080
PyTorch version: 1.6.0
NVIDIA-SMI: 450.57
Driver Version: 450.57
CUDA Version: 11.0

Details: I have a scoring script that I am using in lieu of a predictive model; it outputs a NumPy array. Currently I am using `output_final = torch.tensor(output, device='cuda:0')` to convert it to a tensor, which got rid of the `rl_loss` error. Please let me know if there are any other details I can provide.

Thanks for your help!

vmarar commented 3 years ago

After trying some solutions, I found that after converting the NumPy output to a tensor within my scoring script, AND addressing a warning about using a source tensor to create a copy on the `rl_loss = torch.tensor(rl_loss, requires_grad=True)` line (I simply commented that line out), the weights seem to backpropagate.

Is there any reason it could be wrong to comment out that line, i.e. not create a new tensor from the existing `rl_loss` tensor, and instead backpropagate the gradient from `rl_loss = rl_loss / n_batch`? It's working, but I want to make sure that I am not backpropagating useless values.

Is it possible that, by using a script in lieu of a predictive model, autograd only sees the output of the script, which is in the form of a tensor, and therefore there are no breaks in the computational graph because it is only dealing with the output?
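For reference, a generic way to check that the loss is still attached to the graph and that the model parameters actually receive gradients (`check_backprop` is a hypothetical helper written here for illustration, not part of ReLeaSE):

```python
import torch

def check_backprop(loss, model, optimizer):
    # The loss must still carry a grad_fn, otherwise there is nothing to backpropagate.
    assert loss.grad_fn is not None, "loss is detached from the computational graph"
    optimizer.zero_grad()
    loss.backward()
    # After backward(), at least one parameter should have a nonzero gradient.
    grads = [p.grad for p in model.parameters()]
    assert any(g is not None and g.abs().sum() > 0 for g in grads), \
        "no parameter received a nonzero gradient"
    optimizer.step()

# Example usage with a toy model:
model = torch.nn.Linear(3, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(1, 3)).sum()
check_backprop(loss, model, opt)
```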