lightonai / RITA

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.
MIT License

ESM's log_likelihood equivalent calculation for RITA? #6

Closed by avilella 2 years ago

avilella commented 2 years ago

Hi, is there a way to calculate the equivalent of ESM's log_likelihood (protein stability, a.k.a. fitness) for a protein sequence using this repository? Thanks in advance.

DanielHesslow commented 2 years ago

Thanks for the interest! Yup, I think what you're looking for is provided in #8; see compute_fitness.py.
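For context, the usual sequence log-likelihood for an autoregressive model is the sum of the log-probabilities of each token given the tokens before it. Below is a minimal sketch of that sum, assuming the per-position logits are already available as plain lists. The names `log_softmax` and `sequence_log_likelihood` are hypothetical and are not taken from compute_fitness.py, which may compute its score differently.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def sequence_log_likelihood(logits, token_ids):
    """Sum of log p(token_t | tokens before t) for t >= 1.

    Assumption: logits[t] holds the model's scores for the token at
    position t + 1, as is typical for a causal language model. The first
    token has no preceding context and is not scored here.
    """
    total = 0.0
    for t in range(len(token_ids) - 1):
        total += log_softmax(logits[t])[token_ids[t + 1]]
    return total


# Toy example: uniform logits over a 4-token vocabulary, 3-token sequence.
toy_logits = [[0.0, 0.0, 0.0, 0.0]] * 3
toy_score = sequence_log_likelihood(toy_logits, [1, 2, 3])
```

With a real model the logits would come from a forward pass over the tokenized sequence; that step is left out here because it depends on the model and tokenizer being loaded correctly.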

avilella commented 2 years ago

Thanks, I managed to run compute_fitness.py with --help and get the usage prompt after installing transformers via pip3 on Ubuntu 22.04.

I also managed to git clone RITA_s from the huggingface URL.

What would be an example command line to run compute_fitness.py?

This is how far I've got:

$ python3 compute_fitness.py --RITA_model_name_or_path RITA_s --RITA_tokenizer_name_or_path RITA_s/tokenizer.json 
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "/home/user/RITA/compute_fitness.py", line 105, in <module>
    main()
  File "/home/user/RITA/compute_fitness.py", line 76, in main
    model = AutoModelForCausalLM.from_pretrained(args.RITA_model_name_or_path,trust_remote_code=True)
  File "/home/user/miniconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 440, in from_pretrained
    model_class = get_class_from_dynamic_module(
  File "/home/user/miniconda3/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 384, in get_class_from_dynamic_module
    return get_class_in_module(class_name, final_module.replace(".py", ""))
  File "/home/user/miniconda3/lib/python3.9/site-packages/transformers/dynamic_module_utils.py", line 154, in get_class_in_module
    module = importlib.import_module(module_path)
  File "/home/user/miniconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'transformers_modules.local.rita_modeling'
(base) user@LS4:~/RITA$ #python3 compute_fitness.py --RITA_model_name_or_path RITA_s --RITA_tokenizer_name_or_path RITA_s/tokenizer.json 
(base) user@LS4:~/RITA$ less README.md 
(base) user@LS4:~/RITA$ ls -lrt
total 24
drwxrwxr-x 2 user user 4096 Jul 15 08:40 _static
-rw-rw-r-- 1 user user 2081 Jul 15 08:40 README.md
-rw-rw-r-- 1 user user  560 Jul 15 08:40 example.py
-rw-rw-r-- 1 user user 5416 Jul 15 08:40 compute_fitness.py
drwxrwxr-x 3 user user 4096 Jul 15 09:04 RITA_s