imerelli opened 11 months ago
Hello, does this problem persist only for this particular pdb file, or for every other pdb file as well?
Also, are you trying to load a different tensor file to use for the prediction? I am a bit confused about where you are getting the tensor.pt file from. Apologies for not understanding your question clearly.
The problem persists for all the pdb files I tried. The fact is that I have no idea what I should put in the line checkpoint = torch.load(MODEL_CHECKPOINT_PATH), or, if I have to create the tensors.pt file, how to do that. I'm using a server with an A100 GPU.
Hello, you do not need to change that line. MODEL_CHECKPOINT_PATH is a variable containing the path to the saved model weights, in this instance the Saved_Model directory. If you check the Saved_Model directory, it contains a file called model.ckpt, which holds the saved model weights. There is no need to create any tensor.pt file.
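For illustration, a minimal sketch of what that variable and that line amount to, assuming the Saved_Model layout described above (the exact path in your clone may differ):

import torch

MODEL_CHECKPOINT_PATH = "Saved_Model/model.ckpt"  # illustrative path to the shipped model weights
checkpoint = torch.load(MODEL_CHECKPOINT_PATH)    # loads the saved weights; no tensor.pt file is involved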
Please let me know if you have further questions.
Thanks and regards.
Hello, using the original script, the line checkpoint = torch.load(MODEL_CHECKPOINT_PATH) gives this error:
$ python src/prediction.py examples/8b0s.pdb C_144_A A predictions
/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/pytorch_lightning/utilities/parsing.py:262: UserWarning: Attribute 'MODEL' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['MODEL'])`.
rank_zero_warn(
Traceback (most recent call last):
File "/opt/tools/deg/PreMut/src/prediction.py", line 154, in <module>
prediction.predict()
File "/opt/tools/deg/PreMut/src/prediction.py", line 114, in predict
checkpoint = torch.load(MODEL_CHECKPOINT_PATH)
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 1049, in _load
result = unpickler.load()
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/pickle.py", line 1213, in load
dispatch[key[0]](self)
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/pickle.py", line 1254, in load_binpersid
self.append(self.persistent_load(pid))
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 1019, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 1001, in load_tensor
wrap_storage=restore_location(storage, location),
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 175, in default_restore_location
result = fn(storage, location)
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 152, in _cuda_deserialize
device = validate_cuda_device(location)
File "/opt/tools/deg/miniforge3/envs/PreMut/lib/python3.10/site-packages/torch/serialization.py", line 143, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on CUDA device '
RuntimeError: Attempting to deserialize object on CUDA device 1 but torch.cuda.device_count() is 1. Please use torch.load with map_location to map your storages to an existing device.
This is why I tried to modify it. I get this error with all the pdb files I tried, including the example from the documentation. The GPU is there:
$ nvidia-smi
Sat Dec 9 09:38:15 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01 Driver Version: 470.82.01 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A100-PCI... Off | 00000000:3B:00.0 Off | 0 |
| N/A 49C P0 68W / 300W | 0MiB / 80994MiB | 3% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Hello, I have done a new commit which should address the CUDA device issue. Could you do a git pull or a new git clone and try again? Please let me know if it works.
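For context, the usual remedy for this kind of error is to pass map_location to torch.load, so that tensors saved on a GPU index that does not exist locally (here cuda:1) are remapped onto an available device. A minimal sketch of that idea, with an assumed checkpoint path and not necessarily the exact change made in the commit:

import torch

MODEL_CHECKPOINT_PATH = "Saved_Model/model.ckpt"  # assumed path, adjust to your clone

# Remap storages saved on another GPU onto whatever device exists on this machine.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
checkpoint = torch.load(MODEL_CHECKPOINT_PATH, map_location=device)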
Hi,
Have you guys figured this out? I tried PreMut today and got the CUDA issue. How can I fix it?
Thanks so much!
Hello, could you do a git pull and try again? I have fixed the problem.
Let me try.
It works, I got the predicted .pdb result. But I have a naive question: how can I get a structure image from this .pdb file?
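As a side note, rendering an image from a .pdb file is done outside PreMut with a molecular viewer such as PyMOL, UCSF ChimeraX, or VMD. For example, assuming PyMOL is installed and using an illustrative output file name:

$ pymol -cq predictions/predicted.pdb -d "png predicted.png, ray=1"

This is just one option; any structure viewer can open the predicted .pdb directly.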
Hi, I have problems with the definition of the tensors.pt file and the corresponding map.
I see that in the documentation I should do something like this
But it is not working.
Can you help me?