Acellera / acegen-open

Language models for drug discovery using torchrl
MIT License

NameError: name 'LlamaConfig' is not defined #40

Closed wenchangzhou-qtx closed 4 months ago

wenchangzhou-qtx commented 4 months ago

Hey @MorganCThomas,

Thanks for all the updates, especially adding the llama2 model! I tried it but got the error below. Can you check when you get a chance? I have the transformers library installed, by the way. Thanks!

Traceback (most recent call last):
  File "/home/softwares/acegen-open_latest/scripts/reinvent/reinvent.py", line 118, in main
    run_reinvent(cfg, task)
  File "/home/softwares/acegen-open_latest/scripts/reinvent/reinvent.py", line 175, in run_reinvent
    actor_training, actor_inference = create_actor(vocabulary_size=len(vocabulary))
  File "/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/acegen/models/llama2.py", line 101, in create_llama2_actor
    config = define_llama2_configuration(
  File "/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/acegen/models/llama2.py", line 76, in define_llama2_configuration
    config = LlamaConfig()
NameError: name 'LlamaConfig' is not defined

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Monitor may not have opened/closed properly: 'bool' object has no attribute 'pid'
[2024-06-18 10:55:23,500][molscore][ERROR] - Monitor may not have opened/closed properly: 'bool' object has no attribute 'pid'
Server killed
[2024-06-18 10:55:23,500][base][INFO] - Server killed
Calculating summary metrics
Exception ignored in atexit callback: <bound method MolScoreBenchmark._summarize of <molscore.manager.MolScoreBenchmark object at 0x7ff4ce4cfb50>>
Traceback (most recent call last):
  File "/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/molscore/manager.py", line 1237, in _summarize
    results = self.summarize()
  File "/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/molscore/manager.py", line 1205, in summarize
    print(f"Skipping summary of {MS.configs['task']} as no results found")
AttributeError: 'MolScore' object has no attribute 'configs'

MorganCThomas commented 4 months ago

This seems like it's still an environment issue. I suspect you might need a more recent version of the transformers library. Which version do you have installed? You can check with something like: pip list | grep transformers

What happens when you run the following line with your environment loaded? python -c "from transformers import LlamaConfig"

wenchangzhou-qtx commented 4 months ago

I have transformers 4.24.0, and from the python command I got:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'LlamaConfig' from 'transformers' (/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/transformers/__init__.py)

I can install the latest one to try. Which version do you suggest?

MorganCThomas commented 4 months ago

It seems Llama was integrated in version 4.28.0. So, at least for this model, I will make sure to add a more informative message now that we know which version is required.

@11carlesnavarro which version of transformers should be used for Llama2?
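
For reference, a more informative failure of the kind mentioned above could look like the minimal sketch below. It is not the code that ended up in acegen: the helper names (parse_release, is_at_least, import_llama_config) are hypothetical, and the default minimum of 4.28.0 is only the version named in this thread for the import itself, so it is left as a parameter. The version parsing is deliberately simple and ignores pre-release suffixes such as dev0.

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.42.0.dev0' -> (4, 42, 0)."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric piece such as 'dev0'
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed: str, required: str) -> bool:
    """Compare two version strings on their numeric release components only."""
    a, b = parse_release(installed), parse_release(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.28' and '4.28.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def import_llama_config(min_version: str = "4.28.0"):
    """Hypothetical helper: import LlamaConfig or fail with a readable message."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError as err:
        raise ImportError(
            "The llama2 model needs the transformers library, which is not installed."
        ) from err
    if not is_at_least(installed, min_version):
        raise ImportError(
            f"The llama2 model needs transformers>={min_version}, "
            f"but version {installed} is installed."
        )
    from transformers import LlamaConfig  # only reached when the version check passes

    return LlamaConfig
```

With transformers 4.24.0 installed, calling import_llama_config() in this sketch would raise an ImportError that names both the installed and the required version, instead of the bare NameError shown at the top of the thread.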

wenchangzhou-qtx commented 4 months ago

I tried transformers 4.28.0. I can now import LlamaConfig from transformers, but I still get an error:

Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/softwares/acegen-open_latest/scripts/reinvent/reinvent.py", line 105, in main
    run_reinvent(cfg, task)
  File "/home/softwares/acegen-open_latest/scripts/reinvent/reinvent.py", line 177, in run_reinvent
    adapt_state_dict(ckpt, actor_inference.state_dict())
  File "/home/softwares/mambaforge3/envs/acegen_latest/lib/python3.10/site-packages/acegen/models/utils.py", line 18, in adapt_state_dict
    raise ValueError(
ValueError: The source and target state dicts don't have the same number of parameters.

albertbou92 commented 4 months ago

Hello!

I had the same problem when testing the code. Try installing the latest version of the transformers lib via pip install git+https://github.com/huggingface/transformers

This should be resolved by the next transformers release; for now, install it this way. It should install version 4.42.0.dev0, which works fine.
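
If the ValueError above shows up again with another transformers version, comparing the parameter names of the checkpoint and the freshly built model can show where the two differ. The sketch below is only an illustration: compare_state_dicts is a hypothetical helper, not part of acegen, and the key names are made-up stand-ins. It accepts any mapping from parameter name to value, such as the result of a PyTorch state_dict() call.

```python
def compare_state_dicts(source: dict, target: dict) -> dict:
    """Report which keys appear on only one side and whether the sizes match."""
    source_keys, target_keys = set(source), set(target)
    return {
        "only_in_source": sorted(source_keys - target_keys),
        "only_in_target": sorted(target_keys - source_keys),
        "same_count": len(source) == len(target),
    }


# Hypothetical stand-ins for a loaded checkpoint and actor_inference.state_dict().
ckpt = {"embed.weight": 0, "layers.0.weight": 0, "extra.buffer": 0}
model_state = {"embed.weight": 0, "layers.0.weight": 0}

report = compare_state_dicts(ckpt, model_state)
print(report)
```

Keys listed under only_in_source or only_in_target are the entries that make the two state dicts differ in size.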

wenchangzhou-qtx commented 4 months ago

Hey @MorganCThomas,

On a related topic, but regarding molscore: it seems the syntax for including molscore changed. Another observation: when using single mode with my own json file as the molscore_task, the run is much slower than the provided setup, which is benchmark mode with LibINVENT_Exp1 as the molscore_task. Is this normal, and which molscore_mode do you suggest for my own designs? Thanks!

MorganCThomas commented 4 months ago

Yes, we changed the syntax to make it cleaner for the latest version of MolScore, which now includes curriculum learning.

To answer that, I would need to know what your own json file configures. Benchmark mode is just an iterator over single mode, so it has no reason to be faster.

wenchangzhou-qtx commented 4 months ago

I might have been confused by the speed difference between models, e.g. gru vs llama2. My bad.

Regarding the molscore_mode options single, benchmark and curriculum: it seems I can only run single with my own json file. With benchmark or curriculum, I need to have something in the presets, maybe in MolScore/molscore/configs/? This is the error I got: AssertionError: Preset /home/test.json not found in presets

MorganCThomas commented 4 months ago

Yes, that is exactly how it is designed. If you are running your own custom objective, use single mode and set molscore_task: to your config path.

Alternatively, if you have multiple objectives you want to run "back to back" like a benchmark, you can pass the directory containing them to the parameter custom_benchmark: (note that you should leave molscore_task: null here) and, of course, switch to molscore_mode: benchmark. You can also run only one individual task from a benchmark group of tasks by adding it to include: [my_task].

There is no need to put anything in MolScore/molscore/configs.

See the tutorials here
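
The two setups described above can be sketched as plain Python dictionaries. The keys (molscore_mode, molscore_task, custom_benchmark, include) are the ones named in this thread; the paths and the task name are placeholders, and check_molscore_settings is a hypothetical helper that only encodes the rules stated in this comment. It is not part of acegen or MolScore.

```python
def check_molscore_settings(cfg: dict) -> list:
    """Return a list of problems found, based only on the rules stated above."""
    problems = []
    mode = cfg.get("molscore_mode")
    task = cfg.get("molscore_task")
    custom = cfg.get("custom_benchmark")
    if mode == "single" and not task:
        problems.append("single mode needs molscore_task set to a config path")
    if mode == "benchmark" and custom and task is not None:
        problems.append("leave molscore_task null when custom_benchmark is set")
    return problems


# One custom objective: single mode, pointing at your own json file.
single_cfg = {
    "molscore_mode": "single",
    "molscore_task": "/path/to/my_task.json",  # placeholder path
}

# Several custom objectives run back to back: benchmark mode with a directory.
benchmark_cfg = {
    "molscore_mode": "benchmark",
    "molscore_task": None,  # left null, as noted above
    "custom_benchmark": "/path/to/my_tasks_dir",  # placeholder directory
    "include": ["my_task"],  # optional: run only this task from the group
}

print(check_molscore_settings(single_cfg))
print(check_molscore_settings(benchmark_cfg))
```

Both example settings pass the check. Setting molscore_task to a path while custom_benchmark is also set would be reported as a problem.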

wenchangzhou-qtx commented 4 months ago

Got it, thanks for explaining!

MorganCThomas commented 4 months ago

Improved error messages in #41