Tensor error - Githubissues

steven0seagal commented 2 weeks ago

Hi! Thank you for your great work. I am working with different diffusion methods and want to test your solution in depth. Unfortunately, I cannot run the inference_single.py script. My command and the resulting error are below; maybe you could help me solve it.

 (DFMDock) ubuntu@ubuntu:/mnt/storage/DFMDock/src$ python3 inference_single.py ../structures/6ibb_r_u.pdb ../structures/6ibb_r_u.pdb
/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/cuda/__init__.py:654: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
{'lm_embed_dim': 1301, 'positional_embed_dim': 66, 'spatial_embed_dim': 100, 'node_dim': 256, 'edge_dim': 128, 'inner_dim': 128, 'depth': 6, 'dropout': 0.1, 'cut_off': 20.0, 'normalize': True}
  0%|          | 0/40 [00:00<?, ?it/s]
  0%|          | 0/40 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/storage/DFMDock/src/inference_single.py", line 531, in <module>
    main(args)
  File "/mnt/storage/DFMDock/src/inference_single.py", line 431, in main
    rec_pos, lig_pos, rot_update, tr_update, energy, num_clashes = Euler_Maruyama_sampler(
  File "/mnt/storage/DFMDock/src/inference_single.py", line 350, in Euler_Maruyama_sampler
    output = model(batch) 
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/storage/DFMDock/src/models/score_model.py", line 52, in forward
    outputs = self.net(batch, predict=True)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/storage/DFMDock/src/models/score_net.py", line 366, in forward
    edge += self.positional_embed(position_matrix)
RuntimeError: The size of tensor a (602) must match the size of tensor b (870) at non-singleton dimension 1

lchu11 commented 2 weeks ago

Thanks for pointing out the issue. I believe it's caused by the PDB reading function in inference_single.py. I've updated the code (https://github.com/Graylab/DFMDock/commit/6979d30cc3e9d87fa5d370ec76422a7233731024) to filter out HETATM lines in the PDB file, and I hope this resolves the problem.

Please let me know if you encounter any further errors.
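
For readers hitting the same size mismatch, here is a minimal sketch of the kind of HETATM filtering described above. It is not the code from the linked commit; the function names `keep_atom_records` and `count_residues` are hypothetical, and the idea that extra HETATM residues explain the 602 vs. 870 mismatch is an assumption. It relies only on the fixed-column PDB layout (record name in columns 1-6, chain ID in column 22, residue number in columns 23-26, insertion code in column 27).

```python
def keep_atom_records(pdb_lines):
    """Return only ATOM records, dropping HETATM (waters, ions, ligands) and all other lines."""
    # The record name occupies the first six columns of a PDB line.
    return [line for line in pdb_lines if line[:6].strip() == "ATOM"]


def count_residues(pdb_lines):
    """Count distinct residues among ATOM/HETATM records by (chain, residue number, insertion code)."""
    residues = set()
    for line in pdb_lines:
        if line[:6].strip() in ("ATOM", "HETATM"):
            # 0-based slices for columns 22, 23-26 and 27 of the PDB format.
            residues.add((line[21:22], line[22:26], line[26:27]))
    return len(residues)


# Hypothetical usage: compare the residue count before and after filtering.
#   with open("../structures/6ibb_r_u.pdb") as handle:
#       lines = handle.readlines()
#   print(count_residues(lines), count_residues(keep_atom_records(lines)))
```

If the two counts differ, features computed per residue from the unfiltered file could end up with a different length than features computed from the protein sequence alone, which is one way a shape mismatch like the one above can arise.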

steven0seagal commented 1 week ago

Thanks for the update. Unfortunately, I now get a different error:

{'lm_embed_dim': 1301, 'positional_embed_dim': 66, 'spatial_embed_dim': 100, 'node_dim': 256, 'edge_dim': 128, 'inner_dim': 128, 'depth': 6, 'dropout': 0.1, 'cut_off': 20.0, 'normalize': True}
  0%|          | 0/40 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/storage/DFMDock/src/inference_single.py", line 532, in <module>
    main(args)
  File "/mnt/storage/DFMDock/src/inference_single.py", line 432, in main
    rec_pos, lig_pos, rot_update, tr_update, energy, num_clashes = Euler_Maruyama_sampler(
  File "/mnt/storage/DFMDock/src/inference_single.py", line 353, in Euler_Maruyama_sampler
    output = model(batch) 
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/storage/DFMDock/src/models/score_model.py", line 52, in forward
    outputs = self.net(batch, predict=True)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/storage/DFMDock/src/models/score_net.py", line 361, in forward
    node = self.single_embed(x) # [n, c]
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/DFMDock/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_mm)

However, I managed to find a fix for it and created a PR: https://github.com/Graylab/DFMDock/pull/2
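
For anyone who lands here with the same "cpu and cuda:0" error: it means the model and its inputs were not all on one device when the forward pass ran. The sketch below shows one common way to handle that and is not necessarily what the PR does. The helper name `move_to_device` is hypothetical, and it assumes the batch is a (possibly nested) dict, list, or tuple of tensors, which the traceback does not confirm.

```python
def move_to_device(obj, device):
    """Recursively move anything exposing .to() (e.g. torch.Tensor, torch.nn.Module) to `device`."""
    if hasattr(obj, "to"):
        return obj.to(device)
    if isinstance(obj, dict):
        return {key: move_to_device(value, device) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type; named tuples are not handled here.
        return type(obj)(move_to_device(value, device) for value in obj)
    # Strings, numbers, None, etc. are returned unchanged.
    return obj


# Hypothetical usage with PyTorch (`model` and `batch` are the names seen in the traceback):
#   import torch
#   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#   model = model.to(device)
#   batch = move_to_device(batch, device)
#   output = model(batch)
```

Choosing the device once and moving both the model and every input to it keeps the script working on CPU-only machines as well as on GPUs.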

lchu11 commented 1 week ago

Thanks for running and fixing the code!

Graylab / DFMDock

Tensor error #1