uw-ipd / RoseTTAFold2NA

RoseTTAFold2 protein/nucleic acid complex prediction
MIT License

NVTX functions not installed #104

Closed by SuhasSrinivasan 2 months ago

SuhasSrinivasan commented 2 months ago

While trying the fix for issue #99 on the dimaio/new_config branch, I ran into an old issue:

Running RoseTTAFold2NA to predict structures
 -> Running command: python /mnt/storage/rf2na-new/network/predict.py -inputs PR:/mnt/storage/rf2na-new/example/rna_pred/rna_binding_protein.RNA.a3m:/mnt/storage/rf2na-new/example/rna_pred/rna_binding_protein.hhr:/mnt/storage/rf2na-new/example/rna_pred/rna_binding_protein.atab -prefix /mnt/storage/rf2na-new/example/rna_pred/models/model -model /mnt/storage/rf2na-new/network/weights/RF2NA_apr23.pt -db /mnt/storage/rf2na-new/pdb100_2021Mar03/pdb100_2021Mar03
/mnt/storage/rf2na-new/network/util.py:230: UserWarning: Using torch.cross without specifying the dim arg is deprecated.
Please either pass the dim explicitly or simply use torch.linalg.cross.
The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at /opt/conda/conda-bld/pytorch_1711403233856/work/aten/src/ATen/native/Cross.cpp:63.)
  Z = torch.cross(Xn,Yn)
Running on CPU
           plddt    best
Traceback (most recent call last):
  File "/mnt/storage/rf2na-new/network/predict.py", line 377, in <module>
    pred.predict(inputs=args.inputs, out_prefix=args.prefix, ffdb=ffdb)
  File "/mnt/storage/rf2na-new/network/predict.py", line 250, in predict
    self._run_model(Ls, msa_orig, ins_orig, t1d, t2d, xyz_t, xyz_t[:,0], alpha_t, same_chain, mask_t_2d, "%s_%02d"%(out_prefix, i_trial))
  File "/mnt/storage/rf2na-new/network/predict.py", line 296, in _run_model
    logit_s, logit_aa_s, logit_pae, p_bind, init_crds, alpha_prev, _, pred_lddt_binned, msa_prev, pair_prev, state_prev = self.model(
.
.
.
  File "/miniconda3/envs/RF2NA/lib/python3.10/site-packages/torch/cuda/nvtx.py", line 12, in _fail
    raise RuntimeError(
RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?

Similar issues, #33 and #36, were reported long ago. I tried some of the workarounds from those threads, including installing additional PyTorch libraries and updating all packages to their most recent versions.

Attached is the conda environment export: conda env RF2NA.txt

The system runs Ubuntu 22.04.4 LTS; the GPU and CUDA versions are below.

$ nvidia-smi
Tue Apr 16 21:40:24 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX A6000               Off |   00000000:01:00.0 Off |                  Off |
| 30%   32C    P8             22W /  300W |       1MiB /  49140MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

SuhasSrinivasan commented 2 months ago

Something is wrong with the PyTorch installation for dimaio/new_config: PyTorch does not detect the GPU.

$ python
Python 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:24:10) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False
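
The check above can be extended slightly to tell a CPU-only PyTorch build apart from a CUDA build that cannot see the GPU. The following is a minimal sketch; the function name `diagnose_torch` is made up for illustration. It relies on `torch.version.cuda`, which is `None` for CPU-only builds:

```python
import importlib.util


def diagnose_torch():
    """Return a small report describing the installed PyTorch build."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed in the active environment
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # None means the wheel/conda package was built without CUDA
        "built_with_cuda": torch.version.cuda,
        # False means no usable GPU was found at runtime
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in diagnose_torch().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, the environment most likely resolved a CPU-only PyTorch package. If it shows a CUDA version but `cuda_available` is `False`, the driver or runtime setup is the more likely culprit.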

SuhasSrinivasan commented 2 months ago

The solution is described in #105.