openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Unable to load via huggingface #8

Closed calvin-scio closed 1 year ago

calvin-scio commented 1 year ago

Hi OpenLLaMA authors! Thanks for your amazing contribution 😄 this is game-changing. I've been trying to load the model through Hugging Face's usual model loader classes, but it's failing. Could you advise on how to get it working?

from transformers import AutoModelForCausalLM
AutoModelForCausalLM.from_pretrained('openlm-research/open_llama_7b_preview_200bt')
  File "/Users/user/workspace/py310/lib/python3.10/site-packages/transformers/configuration_utils.py", line 573, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/Users/user/workspace/py310/lib/python3.10/site-packages/transformers/configuration_utils.py", line 628, in _get_config_dict
    resolved_config_file = cached_file(
  File "/Users/user/workspace/py310/lib/python3.10/site-packages/transformers/utils/hub.py", line 454, in cached_file
    raise EnvironmentError(
OSError: openlm-research/open_llama_7b_preview_200bt does not appear to have a file named config.json. Checkout 'https://huggingface.co/openlm-research/open_llama_7b_preview_200bt/main' for available files.

Is it because the actual config.json is nested one folder deeper in open_llama_7b_preview_200bt_transformers_weights rather than at the base? Is there a convenient way around this so it's compatible with HF? Thank you!

The same thing happens when I try from transformers import LlamaModel, btw.

young-geng commented 1 year ago

Yeah, it is nested inside the open_llama_7b_preview_200bt_transformers_weights folder because we also have JAX weights for our own EasyLM framework. BTW, for using it with the Transformers framework, I strongly recommend our new 300bt checkpoint, as that one is not as sensitive to the BOS token.
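
For anyone who does want to stick with the 200bt preview checkpoint, here is a minimal sketch of how to double-check that the BOS token is actually being prepended (the local path is just a placeholder, and this relies on LlamaTokenizer's default add_bos_token behavior):

from transformers import LlamaTokenizer

# Placeholder path to the nested transformers weights folder of the 200bt preview.
tokenizer = LlamaTokenizer.from_pretrained(
    "/path/to/open_llama_7b_preview_200bt/open_llama_7b_preview_200bt_transformers_weights"
)

# LlamaTokenizer prepends the BOS token by default; verify it is there,
# since the 200bt preview checkpoint is sensitive to it.
input_ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids
assert input_ids[0, 0].item() == tokenizer.bos_token_id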

riversun commented 1 year ago

Thank you, authors, for your great efforts.

I also wanted to use it with Hugging Face's Transformers, so I downloaded the model locally as follows:

git clone https://huggingface.co/openlm-research/open_llama_7b_preview_300bt

Then I loaded the model directly by specifying the directory "open_llama_7b_preview_300bt_transformers_weights":

model_path = "/home/user/sandbox/open_llama_7b_preview_300bt/open_llama_7b_preview_300bt_transformers_weights"

Here is the code I tried: https://github.com/riversun/open_llama_7b_hands_on
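
For quick reference, a rough sketch of that loading step (assuming the tokenizer files sit in the same transformers_weights directory; the prompt and generation settings are just illustrative):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/home/user/sandbox/open_llama_7b_preview_300bt/open_llama_7b_preview_300bt_transformers_weights"

# Both the tokenizer and the model load from the nested transformers weights folder.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

inputs = tokenizer("Q: What is the capital of France?\nA:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))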

I hope this helps!

calvin-scio commented 1 year ago

That looks great @riversun, worked for me as well!

Shamdan17 commented 1 year ago

A faster way is to just pass the subfolder argument to from_pretrained. Example for AutoModelForCausalLM:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b_preview_300bt",
    subfolder="open_llama_7b_preview_300bt_transformers_weights",
)
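
The tokenizer can presumably be loaded the same way; an untested sketch, assuming the tokenizer's from_pretrained also honors the subfolder argument:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "openlm-research/open_llama_7b_preview_300bt",
    subfolder="open_llama_7b_preview_300bt_transformers_weights",
)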