bminixhofer / zett

Code for Zero-Shot Tokenizer Transfer
https://arxiv.org/abs/2405.07883

Missing flax_model.msgpack for TinyLlama #2

Closed LorrinWWW closed 4 months ago

LorrinWWW commented 4 months ago

I am trying to adapt TinyLlama to the Mistral tokenizer, and it fails with:

Traceback (most recent call last):
  File "/home/jue/zett/scripts/transfer.py", line 92, in <module>
    load_params(args.target_model, revision=args.revision)
  File "/home/jue/zett/zett/utils.py", line 736, in load_params
    files = [cached_file(model_name_or_path, "flax_model.msgpack", **kwargs)]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jue/miniforge3/envs/zett/lib/python3.11/site-packages/transformers/utils/hub.py", line 453, in cached_file
    raise EnvironmentError(
OSError: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T does not appear to have a file named flax_model.msgpack. Checkout 'https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T/tree/main' for available files.

The command is:

python3 scripts/transfer.py \
    --target_model=TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T \
    --tokenizer_name=mistralai/Mistral-7B-Instruct-v0.1 \
    --output=my-new-tiny-llama \
    --model_class=AutoModelForCausalLM \
    --lang_code=en \
    --lang_path=lang.txt \
    --checkpoint_path=zett-hypernetwork-TinyLlama-1.1B-intermediate-step-1431k-3T \
    --save_pt
LorrinWWW commented 4 months ago

I realized there is a PR on the TinyLlama repo that adds Flax weights, so simply adding --revision=refs/pr/8 fixes it.
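
For reference, the full command with that flag added (everything else unchanged from the command above; transfer.py already passes --revision through to load_params, as the traceback shows):

```shell
python3 scripts/transfer.py \
    --target_model=TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T \
    --revision=refs/pr/8 \
    --tokenizer_name=mistralai/Mistral-7B-Instruct-v0.1 \
    --output=my-new-tiny-llama \
    --model_class=AutoModelForCausalLM \
    --lang_code=en \
    --lang_path=lang.txt \
    --checkpoint_path=zett-hypernetwork-TinyLlama-1.1B-intermediate-step-1431k-3T \
    --save_pt
```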

bminixhofer commented 4 months ago

Closing this as solved, but I'll clarify this in the README!