programmablebio / ptm-mamba

python dependency versions #1

Open druyang opened 7 months ago

druyang commented 7 months ago

Hello,

This project seems very neat! I've been running into some version issues between CUDA and triton. mamba_ssm also seems to be a dependency that is not covered by the instructions: following them yields the error below, even after running cd protein_lm/modeling/models/libs/ && pip install -e causal-conv1d && pip install -e mamba && cd ../../../../ :

File "/notebooks/test.py", line 1, in <module>
  from protein_lm.modeling.scripts.infer import PTMMamba
File "/notebooks/protein_lm/modeling/scripts/infer.py", line 5, in <module>
  from protein_lm.modeling.scripts.train import compute_esm_embedding, load_ckpt, make_esm_input_ids
File "/notebooks/protein_lm/modeling/scripts/train.py", line 26, in <module>
  from protein_lm.modeling.models.mamba.lm import MambaLMHeadModel
File "/notebooks/protein_lm/modeling/models/mamba/lm.py", line 11, in <module>
  from mamba_ssm.modules.mamba_simple import Mamba, Block
ModuleNotFoundError: No module named 'mamba_ssm'
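
A quick way to narrow down an import failure like the one above is to check which top-level modules Python can actually locate before importing the project. The following is a minimal sketch, not part of the repository; the module list is an assumption based on the packages mentioned in this thread, and `find_missing` is a hypothetical helper name.

```python
import importlib.util

# Top-level modules mentioned in this thread (assumed list; adjust as needed).
REQUIRED_MODULES = ["torch", "triton", "causal_conv1d", "mamba_ssm"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located on sys.path."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("missing modules:", missing if missing else "none")
```

If mamba_ssm shows up as missing even after the editable installs, the install step itself probably failed or ran in a different environment than the one running test.py.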

I wonder if perhaps these issues are related to versioning of the packages. Would it be possible to provide a requirements.txt or environment.yml from a working environment?

This would also help clarify CUDA and PyTorch version compatibility.
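
For reference, pinned versions can be dumped from any working environment with the standard library alone. This is a rough sketch under stated assumptions: the distribution names in `PACKAGES` are guesses (editable installs may register under different names), and `installed_versions` and `as_requirements` are hypothetical helpers, not something the repository provides.

```python
from importlib import metadata

# Distribution names are assumptions; editable installs may register differently.
PACKAGES = ["torch", "triton", "mamba-ssm", "causal-conv1d"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_requirements(versions):
    """Format the packages that were found as pinned requirements.txt lines."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]


if __name__ == "__main__":
    print("\n".join(as_requirements(installed_versions(PACKAGES))))
```

The CUDA version that PyTorch was built against is not visible in package metadata; it can be read separately from torch.version.cuda once torch is importable.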

Also, when loading the best.ckpt file, the checkpoint does not appear to match the layers in the model:

  Traceback (most recent call last):
    File "/notebooks/test.py", line 4, in <module>
      mamba = PTMMamba(ckpt_path,device='cuda:0')
    File "/notebooks/protein_lm/modeling/scripts/infer.py", line 15, in __init__
      self._model = load_ckpt(ckpt_path, self.tokenizer, device)
    File "/notebooks/protein_lm/modeling/scripts/train.py", line 149, in load_ckpt
      msg = model.load_state_dict(model_state_dict, strict=True)
    File "/usr/local/lib/python3.9/dist-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
      raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
  RuntimeError: Error(s) in loading state_dict for MambaLMHeadModel:
          Unexpected key(s) in state_dict: "backbone.forward_layers.0.mixer.A_b_log", "backbone.forward_layers.0.mixer.D_b", "backbone.forward_layers.0.mixer.conv1d_b.weight", "backbone.forward_layers.0.mixer.conv1d_b.bias", "backbone.forward_layers.0.mixer.x_proj_b.weight", "backbone.forward_layers.0.mixer.dt_proj_b.weight", "backbone.forward_layers.0.mixer.dt_proj_b.bias", "backbone.forward_layers.1.mixer.A_b_log", "backbone.forward_layers.1.mixer.D_b", "backbone.forward_layers.1.mixer.conv1d_b.weight", "backbone.forward_layers.1.mixer.conv1d_b.bias", "backbone.forward_layers.1.mixer.x_proj_b.weight", "backbone.forward_layers.1.mixer.dt_proj_b.weight", "backbone.forward_layers.1.mixer.dt_proj_b.bias", "backbone.forward_layers.2.mixer.A_b_log", "backbone.forward_layers.2.mixer.D_b", "backbone.forward_layers.2.mixer.conv1d_b.weight", "backbone.forward_layers.2.mixer.conv1d_b.bias", "backbone.forward_layers.2.mixer.x_proj_b.weight", "backbone.forward_layers.2.mixer.dt_proj_b.weight", "backbone.forward_layers.2.mixer.dt_proj_b.bias", "backbone.forward_layers.3.mixer.A_b_log", "backbone.forward_layers.3.mixer.D_b", "backbone.forward_layers.3.mixer.conv1d_b.weight", "backbone.forward_layers.3.mixer.conv1d_b.bias", "backbone.forward_layers.3.mixer.x_proj_b.weight", "backbone.forward_layers.3.mixer.dt_proj_b.weight", "backbone.forward_layers.3.mixer.dt_proj_b.bias", "backbone.forward_layers.4.mixer.A_b_log", "backbone.forward_layers.4.mixer.D_b", "backbone.forward_layers.4.mixer.conv1d_b.weight", "backbone.forward_layers.4.mixer.conv1d_b.bias", "backbone.forward_layers.4.mixer.x_proj_b.weight", "backbone.forward_layers.4.mixer.dt_proj_b.weight", "backbone.forward_layers.4.mixer.dt_proj_b.bias", "backbone.forward_layers.5.mixer.A_b_log", "backbone.forward_layers.5.mixer.D_b", "backbone.forward_layers.5.mixer.conv1d_b.weight", "backbone.forward_layers.5.mixer.conv1d_b.bias", "backbone.forward_layers.5.mixer.x_proj_b.weight", 
"backbone.forward_layers.5.mixer.dt_proj_b.weight", "backbone.forward_layers.5.mixer.dt_proj_b.bias", "backbone.forward_layers.6.mixer.A_b_log", "backbone.forward_layers.6.mixer.D_b", "backbone.forward_layers.6.mixer.conv1d_b.weight", "backbone.forward_layers.6.mixer.conv1d_b.bias", "backbone.forward_layers.6.mixer.x_proj_b.weight", "backbone.forward_layers.6.mixer.dt_proj_b.weight", "backbone.forward_layers.6.mixer.dt_proj_b.bias", "backbone.forward_layers.7.mixer.A_b_log", "backbone.forward_layers.7.mixer.D_b", "backbone.forward_layers.7.mixer.conv1d_b.weight", "backbone.forward_layers.7.mixer.conv1d_b.bias", "backbone.forward_layers.7.mixer.x_proj_b.weight", "backbone.forward_layers.7.mixer.dt_proj_b.weight", "backbone.forward_layers.7.mixer.dt_proj_b.bias", "backbone.forward_layers.8.mixer.A_b_log", "backbone.forward_layers.8.mixer.D_b", "backbone.forward_layers.8.mixer.conv1d_b.weight", "backbone.forward_layers.8.mixer.conv1d_b.bias", "backbone.forward_layers.8.mixer.x_proj_b.weight", "backbone.forward_layers.8.mixer.dt_proj_b.weight", "backbone.forward_layers.8.mixer.dt_proj_b.bias", "backbone.forward_layers.9.mixer.A_b_log", "backbone.forward_layers.9.mixer.D_b", "backbone.forward_layers.9.mixer.conv1d_b.weight", "backbone.forward_layers.9.mixer.conv1d_b.bias", "backbone.forward_layers.9.mixer.x_proj_b.weight", "backbone.forward_layers.9.mixer.dt_proj_b.weight", "backbone.forward_layers.9.mixer.dt_proj_b.bias", "backbone.forward_layers.10.mixer.A_b_log", "backbone.forward_layers.10.mixer.D_b", "backbone.forward_layers.10.mixer.conv1d_b.weight", "backbone.forward_layers.10.mixer.conv1d_b.bias", "backbone.forward_layers.10.mixer.x_proj_b.weight", "backbone.forward_layers.10.mixer.dt_proj_b.weight", "backbone.forward_layers.10.mixer.dt_proj_b.bias", "backbone.forward_layers.11.mixer.A_b_log", "backbone.forward_layers.11.mixer.D_b", "backbone.forward_layers.11.mixer.conv1d_b.weight", "backbone.forward_layers.11.mixer.conv1d_b.bias", 
"backbone.forward_layers.11.mixer.x_proj_b.weight", "backbone.forward_layers.11.mixer.dt_proj_b.weight", "backbone.forward_layers.11.mixer.dt_proj_b.bias", "backbone.forward_layers.12.mixer.A_b_log", "backbone.forward_layers.12.mixer.D_b", "backbone.forward_layers.12.mixer.conv1d_b.weight", "backbone.forward_layers.12.mixer.conv1d_b.bias", "backbone.forward_layers.12.mixer.x_proj_b.weight", "backbone.forward_layers.12.mixer.dt_proj_b.weight", "backbone.forward_layers.12.mixer.dt_proj_b.bias", "backbone.forward_layers.13.mixer.A_b_log", "backbone.forward_layers.13.mixer.D_b", "backbone.forward_layers.13.mixer.conv1d_b.weight", "backbone.forward_layers.13.mixer.conv1d_b.bias", "backbone.forward_layers.13.mixer.x_proj_b.weight", "backbone.forward_layers.13.mixer.dt_proj_b.weight", "backbone.forward_layers.13.mixer.dt_proj_b.bias", "backbone.forward_layers.14.mixer.A_b_log", "backbone.forward_layers.14.mixer.D_b", "backbone.forward_layers.14.mixer.conv1d_b.weight", "backbone.forward_layers.14.mixer.conv1d_b.bias", "backbone.forward_layers.14.mixer.x_proj_b.weight", "backbone.forward_layers.14.mixer.dt_proj_b.weight", "backbone.forward_layers.14.mixer.dt_proj_b.bias", "backbone.forward_layers.15.mixer.A_b_log", "backbone.forward_layers.15.mixer.D_b", "backbone.forward_layers.15.mixer.conv1d_b.weight", "backbone.forward_layers.15.mixer.conv1d_b.bias", "backbone.forward_layers.15.mixer.x_proj_b.weight", "backbone.forward_layers.15.mixer.dt_proj_b.weight", "backbone.forward_layers.15.mixer.dt_proj_b.bias", "backbone.forward_layers.16.mixer.A_b_log", "backbone.forward_layers.16.mixer.D_b", "backbone.forward_layers.16.mixer.conv1d_b.weight", "backbone.forward_layers.16.mixer.conv1d_b.bias", "backbone.forward_layers.16.mixer.x_proj_b.weight", "backbone.forward_layers.16.mixer.dt_proj_b.weight", "backbone.forward_layers.16.mixer.dt_proj_b.bias", "backbone.forward_layers.17.mixer.A_b_log", "backbone.forward_layers.17.mixer.D_b", 
"backbone.forward_layers.17.mixer.conv1d_b.weight", "backbone.forward_layers.17.mixer.conv1d_b.bias", "backbone.forward_layers.17.mixer.x_proj_b.weight", "backbone.forward_layers.17.mixer.dt_proj_b.weight", "backbone.forward_layers.17.mixer.dt_proj_b.bias", "backbone.forward_layers.18.mixer.A_b_log", "backbone.forward_layers.18.mixer.D_b", "backbone.forward_layers.18.mixer.conv1d_b.weight", "backbone.forward_layers.18.mixer.conv1d_b.bias", "backbone.forward_layers.18.mixer.x_proj_b.weight", "backbone.forward_layers.18.mixer.dt_proj_b.weight", "backbone.forward_layers.18.mixer.dt_proj_b.bias", "backbone.forward_layers.19.mixer.A_b_log", "backbone.forward_layers.19.mixer.D_b", "backbone.forward_layers.19.mixer.conv1d_b.weight", "backbone.forward_layers.19.mixer.conv1d_b.bias", "backbone.forward_layers.19.mixer.x_proj_b.weight", "backbone.forward_layers.19.mixer.dt_proj_b.weight", "backbone.forward_layers.19.mixer.dt_proj_b.bias", "backbone.forward_layers.20.mixer.A_b_log", "backbone.forward_layers.20.mixer.D_b", "backbone.forward_layers.20.mixer.conv1d_b.weight", "backbone.forward_layers.20.mixer.conv1d_b.bias", "backbone.forward_layers.20.mixer.x_proj_b.weight", "backbone.forward_layers.20.mixer.dt_proj_b.weight", "backbone.forward_layers.20.mixer.dt_proj_b.bias", "backbone.forward_layers.21.mixer.A_b_log", "backbone.forward_layers.21.mixer.D_b", "backbone.forward_layers.21.mixer.conv1d_b.weight", "backbone.forward_layers.21.mixer.conv1d_b.bias", "backbone.forward_layers.21.mixer.x_proj_b.weight", "backbone.forward_layers.21.mixer.dt_proj_b.weight", "backbone.forward_layers.21.mixer.dt_proj_b.bias", "backbone.forward_layers.22.mixer.A_b_log", "backbone.forward_layers.22.mixer.D_b", "backbone.forward_layers.22.mixer.conv1d_b.weight", "backbone.forward_layers.22.mixer.conv1d_b.bias", "backbone.forward_layers.22.mixer.x_proj_b.weight", "backbone.forward_layers.22.mixer.dt_proj_b.weight", "backbone.forward_layers.22.mixer.dt_proj_b.bias", 
"backbone.forward_layers.23.mixer.A_b_log", "backbone.forward_layers.23.mixer.D_b", "backbone.forward_layers.23.mixer.conv1d_b.weight", "backbone.forward_layers.23.mixer.conv1d_b.bias", "backbone.forward_layers.23.mixer.x_proj_b.weight", "backbone.forward_layers.23.mixer.dt_proj_b.weight", "backbone.forward_layers.23.mixer.dt_proj_b.bias", "backbone.backward_layers.0.mixer.A_b_log", "backbone.backward_layers.0.mixer.D_b", "backbone.backward_layers.0.mixer.conv1d_b.weight", "backbone.backward_layers.0.mixer.conv1d_b.bias", "backbone.backward_layers.0.mixer.x_proj_b.weight", "backbone.backward_layers.0.mixer.dt_proj_b.weight", "backbone.backward_layers.0.mixer.dt_proj_b.bias", "backbone.backward_layers.1.mixer.A_b_log", "backbone.backward_layers.1.mixer.D_b", "backbone.backward_layers.1.mixer.conv1d_b.weight", "backbone.backward_layers.1.mixer.conv1d_b.bias", "backbone.backward_layers.1.mixer.x_proj_b.weight", "backbone.backward_layers.1.mixer.dt_proj_b.weight", "backbone.backward_layers.1.mixer.dt_proj_b.bias", "backbone.backward_layers.2.mixer.A_b_log", "backbone.backward_layers.2.mixer.D_b", "backbone.backward_layers.2.mixer.conv1d_b.weight", "backbone.backward_layers.2.mixer.conv1d_b.bias", "backbone.backward_layers.2.mixer.x_proj_b.weight", "backbone.backward_layers.2.mixer.dt_proj_b.weight", "backbone.backward_layers.2.mixer.dt_proj_b.bias", "backbone.backward_layers.3.mixer.A_b_log", "backbone.backward_layers.3.mixer.D_b", "backbone.backward_layers.3.mixer.conv1d_b.weight", "backbone.backward_layers.3.mixer.conv1d_b.bias", "backbone.backward_layers.3.mixer.x_proj_b.weight", "backbone.backward_layers.3.mixer.dt_proj_b.weight", "backbone.backward_layers.3.mixer.dt_proj_b.bias", "backbone.backward_layers.4.mixer.A_b_log", "backbone.backward_layers.4.mixer.D_b", "backbone.backward_layers.4.mixer.conv1d_b.weight", "backbone.backward_layers.4.mixer.conv1d_b.bias", "backbone.backward_layers.4.mixer.x_proj_b.weight", 
"backbone.backward_layers.4.mixer.dt_proj_b.weight", "backbone.backward_layers.4.mixer.dt_proj_b.bias", "backbone.backward_layers.5.mixer.A_b_log", "backbone.backward_layers.5.mixer.D_b", "backbone.backward_layers.5.mixer.conv1d_b.weight", "backbone.backward_layers.5.mixer.conv1d_b.bias", "backbone.backward_layers.5.mixer.x_proj_b.weight", "backbone.backward_layers.5.mixer.dt_proj_b.weight", "backbone.backward_layers.5.mixer.dt_proj_b.bias", "backbone.backward_layers.6.mixer.A_b_log", "backbone.backward_layers.6.mixer.D_b", "backbone.backward_layers.6.mixer.conv1d_b.weight", "backbone.backward_layers.6.mixer.conv1d_b.bias", "backbone.backward_layers.6.mixer.x_proj_b.weight", "backbone.backward_layers.6.mixer.dt_proj_b.weight", "backbone.backward_layers.6.mixer.dt_proj_b.bias", "backbone.backward_layers.7.mixer.A_b_log", "backbone.backward_layers.7.mixer.D_b", "backbone.backward_layers.7.mixer.conv1d_b.weight", "backbone.backward_layers.7.mixer.conv1d_b.bias", "backbone.backward_layers.7.mixer.x_proj_b.weight", "backbone.backward_layers.7.mixer.dt_proj_b.weight", "backbone.backward_layers.7.mixer.dt_proj_b.bias", "backbone.backward_layers.8.mixer.A_b_log", "backbone.backward_layers.8.mixer.D_b", "backbone.backward_layers.8.mixer.conv1d_b.weight", "backbone.backward_layers.8.mixer.conv1d_b.bias", "backbone.backward_layers.8.mixer.x_proj_b.weight", "backbone.backward_layers.8.mixer.dt_proj_b.weight", "backbone.backward_layers.8.mixer.dt_proj_b.bias", "backbone.backward_layers.9.mixer.A_b_log", "backbone.backward_layers.9.mixer.D_b", "backbone.backward_layers.9.mixer.conv1d_b.weight", "backbone.backward_layers.9.mixer.conv1d_b.bias", "backbone.backward_layers.9.mixer.x_proj_b.weight", "backbone.backward_layers.9.mixer.dt_proj_b.weight", "backbone.backward_layers.9.mixer.dt_proj_b.bias", "backbone.backward_layers.10.mixer.A_b_log", "backbone.backward_layers.10.mixer.D_b", "backbone.backward_layers.10.mixer.conv1d_b.weight", 
"backbone.backward_layers.10.mixer.conv1d_b.bias", "backbone.backward_layers.10.mixer.x_proj_b.weight", "backbone.backward_layers.10.mixer.dt_proj_b.weight", "backbone.backward_layers.10.mixer.dt_proj_b.bias", "backbone.backward_layers.11.mixer.A_b_log", "backbone.backward_layers.11.mixer.D_b", "backbone.backward_layers.11.mixer.conv1d_b.weight", "backbone.backward_layers.11.mixer.conv1d_b.bias", "backbone.backward_layers.11.mixer.x_proj_b.weight", "backbone.backward_layers.11.mixer.dt_proj_b.weight", "backbone.backward_layers.11.mixer.dt_proj_b.bias", "backbone.backward_layers.12.mixer.A_b_log", "backbone.backward_layers.12.mixer.D_b", "backbone.backward_layers.12.mixer.conv1d_b.weight", "backbone.backward_layers.12.mixer.conv1d_b.bias", "backbone.backward_layers.12.mixer.x_proj_b.weight", "backbone.backward_layers.12.mixer.dt_proj_b.weight", "backbone.backward_layers.12.mixer.dt_proj_b.bias", "backbone.backward_layers.13.mixer.A_b_log", "backbone.backward_layers.13.mixer.D_b", "backbone.backward_layers.13.mixer.conv1d_b.weight", "backbone.backward_layers.13.mixer.conv1d_b.bias", "backbone.backward_layers.13.mixer.x_proj_b.weight", "backbone.backward_layers.13.mixer.dt_proj_b.weight", "backbone.backward_layers.13.mixer.dt_proj_b.bias", "backbone.backward_layers.14.mixer.A_b_log", "backbone.backward_layers.14.mixer.D_b", "backbone.backward_layers.14.mixer.conv1d_b.weight", "backbone.backward_layers.14.mixer.conv1d_b.bias", "backbone.backward_layers.14.mixer.x_proj_b.weight", "backbone.backward_layers.14.mixer.dt_proj_b.weight", "backbone.backward_layers.14.mixer.dt_proj_b.bias", "backbone.backward_layers.15.mixer.A_b_log", "backbone.backward_layers.15.mixer.D_b", "backbone.backward_layers.15.mixer.conv1d_b.weight", "backbone.backward_layers.15.mixer.conv1d_b.bias", "backbone.backward_layers.15.mixer.x_proj_b.weight", "backbone.backward_layers.15.mixer.dt_proj_b.weight", "backbone.backward_layers.15.mixer.dt_proj_b.bias", "backbone.backward_layers.16.mixer.A_b_log", 
"backbone.backward_layers.16.mixer.D_b", "backbone.backward_layers.16.mixer.conv1d_b.weight", "backbone.backward_layers.16.mixer.conv1d_b.bias", "backbone.backward_layers.16.mixer.x_proj_b.weight", "backbone.backward_layers.16.mixer.dt_proj_b.weight", "backbone.backward_layers.16.mixer.dt_proj_b.bias", "backbone.backward_layers.17.mixer.A_b_log", "backbone.backward_layers.17.mixer.D_b", "backbone.backward_layers.17.mixer.conv1d_b.weight", "backbone.backward_layers.17.mixer.conv1d_b.bias", "backbone.backward_layers.17.mixer.x_proj_b.weight", "backbone.backward_layers.17.mixer.dt_proj_b.weight", "backbone.backward_layers.17.mixer.dt_proj_b.bias", "backbone.backward_layers.18.mixer.A_b_log", "backbone.backward_layers.18.mixer.D_b", "backbone.backward_layers.18.mixer.conv1d_b.weight", "backbone.backward_layers.18.mixer.conv1d_b.bias", "backbone.backward_layers.18.mixer.x_proj_b.weight", "backbone.backward_layers.18.mixer.dt_proj_b.weight", "backbone.backward_layers.18.mixer.dt_proj_b.bias", "backbone.backward_layers.19.mixer.A_b_log", "backbone.backward_layers.19.mixer.D_b", "backbone.backward_layers.19.mixer.conv1d_b.weight", "backbone.backward_layers.19.mixer.conv1d_b.bias", "backbone.backward_layers.19.mixer.x_proj_b.weight", "backbone.backward_layers.19.mixer.dt_proj_b.weight", "backbone.backward_layers.19.mixer.dt_proj_b.bias", "backbone.backward_layers.20.mixer.A_b_log", "backbone.backward_layers.20.mixer.D_b", "backbone.backward_layers.20.mixer.conv1d_b.weight", "backbone.backward_layers.20.mixer.conv1d_b.bias", "backbone.backward_layers.20.mixer.x_proj_b.weight", "backbone.backward_layers.20.mixer.dt_proj_b.weight", "backbone.backward_layers.20.mixer.dt_proj_b.bias", "backbone.backward_layers.21.mixer.A_b_log", "backbone.backward_layers.21.mixer.D_b", "backbone.backward_layers.21.mixer.conv1d_b.weight", "backbone.backward_layers.21.mixer.conv1d_b.bias", "backbone.backward_layers.21.mixer.x_proj_b.weight", "backbone.backward_layers.21.mixer.dt_proj_b.weight", 
"backbone.backward_layers.21.mixer.dt_proj_b.bias", "backbone.backward_layers.22.mixer.A_b_log", "backbone.backward_layers.22.mixer.D_b", "backbone.backward_layers.22.mixer.conv1d_b.weight", "backbone.backward_layers.22.mixer.conv1d_b.bias", "backbone.backward_layers.22.mixer.x_proj_b.weight", "backbone.backward_layers.22.mixer.dt_proj_b.weight", "backbone.backward_layers.22.mixer.dt_proj_b.bias", "backbone.backward_layers.23.mixer.A_b_log", "backbone.backward_layers.23.mixer.D_b", "backbone.backward_layers.23.mixer.conv1d_b.weight", "backbone.backward_layers.23.mixer.conv1d_b.bias", "backbone.backward_layers.23.mixer.x_proj_b.weight", "backbone.backward_layers.23.mixer.dt_proj_b.weight", "backbone.backward_layers.23.mixer.dt_proj_b.bias".

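A long key list like the one above is easier to read once the per-layer repeats are collapsed. The sketch below compares two sets of parameter names and groups the leftovers by pattern; it is a rough illustration only, `diff_keys` and `summarize` are hypothetical helpers, and the toy keys are made up to resemble the error rather than read from best.ckpt.

```python
import re
from collections import Counter


def diff_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected) key lists, mirroring load_state_dict's report."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    missing = sorted(model_keys - ckpt_keys)      # in the model, not in the checkpoint
    unexpected = sorted(ckpt_keys - model_keys)   # in the checkpoint, not in the model
    return missing, unexpected


def summarize(keys):
    """Collapse numeric layer indices so repeated per-layer keys group together."""
    return Counter(re.sub(r"\.\d+\.", ".*.", k) for k in keys)


if __name__ == "__main__":
    # Toy key lists shaped like the error above (hypothetical, not from a real checkpoint).
    model = ["backbone.forward_layers.0.mixer.D", "backbone.forward_layers.1.mixer.D"]
    ckpt = model + ["backbone.forward_layers.0.mixer.D_b",
                    "backbone.forward_layers.1.mixer.D_b"]
    missing, unexpected = diff_keys(model, ckpt)
    for pattern, count in summarize(unexpected).items():
        print(pattern, count)
```

With real objects, the inputs would be model.state_dict().keys() and the keys of the checkpoint's state dict. In the error above every unexpected key carries a _b suffix, which may mean the installed Mamba module differs from the one the checkpoint was trained with; that reading is an assumption, not something confirmed in this thread.
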
Edited for clarification

pengzhangzhi commented 6 months ago

Hi, I am so sorry for the late response.

Did you use the Docker container that I provided? I believe that would solve the problem, because all the messy installation issues are related to CUDA and torch, and the container has all of them set up.