bminixhofer / zett

Code for Zero-Shot Tokenizer Transfer
https://arxiv.org/abs/2405.07883

Issue running the transfer script for Mistral - RAM OOM #9

Open elements72 opened 1 month ago

elements72 commented 1 month ago

Hi! I tried running the example transfer script for Mistral as written in the repo. The script occupies all available RAM: I tried it both on Colab (12 GB of RAM) and on Kaggle (29 GB of RAM), and in both cases the process exceeded the available RAM. The xlm-roberta example works fine, though.


```shell
python3 scripts/transfer.py \
    --target_model=mistralai/Mistral-7B-v0.1 \
    --revision=refs/pr/95 \
    --tokenizer_name=EleutherAI/gpt-neox-20b \
    --output=my-new-fancy-mistral \
    --model_class=AutoModelForCausalLM \
    --checkpoint_path=zett-hypernetwork-Mistral-7B-v0.1 \
    --save_pt # otherwise saves only Flax weights
```
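For what it's worth, a rough back-of-envelope on the weights alone suggests why this runs out of RAM on free tiers (the parameter count and dtype here are my assumptions, not something the script reports):

```python
# Approximate host RAM needed just to hold the raw model weights.
# Assumes the weights are materialized in host memory; the actual
# dtype the transfer script uses is an assumption on my part.

def model_ram_gib(n_params: float, bytes_per_param: int) -> float:
    """Host RAM needed for the raw weights, in GiB."""
    return n_params * bytes_per_param / 1024**3

N_PARAMS = 7.24e9  # Mistral-7B-v0.1 parameter count (approximate)

fp32_gib = model_ram_gib(N_PARAMS, 4)  # ~27 GiB
fp16_gib = model_ram_gib(N_PARAMS, 2)  # ~13.5 GiB
```

Even at fp16 the weights alone are above Colab's 12 GB, and at fp32 they nearly fill Kaggle's 29 GB before accounting for the tokenizer transfer itself, which would explain why xlm-roberta (a much smaller model) works while Mistral-7B OOMs.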