bminixhofer / zett

Code for Zero-Shot Tokenizer Transfer
https://arxiv.org/abs/2405.07883

Issue in running the transfer script #4

zaidalyafeai commented 1 month ago

I tried running

python scripts/transfer.py \
    --target_model=FacebookAI/xlm-roberta-base \
    --tokenizer_name=gpt2 \
    --output=my-new-fancy-xlm-r \
    --model_class=AutoModelForMaskedLM \
    --lang_code=en \
    --checkpoint_path=zett-hypernetwork-xlm-roberta-base \
    --save_pt # otherwise saves only Flax weights

But I got

AttributeError: 'HyperRobertaConfig' object has no attribute 'language_adapter_bottleneck_dim'

bminixhofer commented 1 month ago

Sorry, this went wrong while cleaning up the codebase. Should be fixed now!

zaidalyafeai commented 1 month ago

Thank you @bminixhofer, it's working now. However, the other script still doesn't work:

python scripts/transfer.py \
    --target_model=mistralai/Mistral-7B-v0.1 \
    --revision=refs/pr/95 \
    --tokenizer_name=EleutherAI/gpt-neox-20b \
    --output=my-new-fancy-mistral \
    --model_class=AutoModelForCausalLM \
    --checkpoint_path=zett-hypernetwork-Mistral-7B-v0.1 \
    --save_pt # otherwise saves only Flax weights

I get ModuleNotFoundError: No module named 'transformers_modules.zett-hypernetwork-Mistral-7B-v0'
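
For reference, the module path appears to be truncated at the dot in the checkpoint directory name. One possible workaround (an untested sketch; the paths are taken from the command above) is to copy the checkpoint to a dot-free directory and point --checkpoint_path at it:

import shutil

# Hypothetical workaround: transformers builds a Python module name from the
# checkpoint directory when loading remote code, and the dot in
# "zett-hypernetwork-Mistral-7B-v0.1" truncates that name at ".1".
shutil.copytree(
    "zett-hypernetwork-Mistral-7B-v0.1",
    "zett-hypernetwork-Mistral-7B-v0-1",
    dirs_exist_ok=True,
)
# Then rerun with --checkpoint_path=zett-hypernetwork-Mistral-7B-v0-1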

zaidalyafeai commented 1 month ago

I tried changing the folder name because there is a dot in it, and now I am getting this error:

ValueError: Unrecognized configuration class <class 'transformers.models.mistral.configuration_mistral.MistralConfig'> for this kind of AutoModel: FlaxAutoModelForMaskedLM.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, DistilBertConfig, ElectraConfig, MBartConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, XLMRobertaConfig.

jubgjf commented 1 month ago

transformers==4.34.1 doesn't have modeling_flax_llama.py; upgrading to 4.41.1 solves this problem.
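
A quick way to check whether your installed version ships it (a minimal sketch; it only probes for the module file):

import importlib.util

# None means the installed transformers version predates Flax Llama support
# and should be upgraded (4.41.1 works here).
print(importlib.util.find_spec("transformers.models.llama.modeling_flax_llama"))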

zaidalyafeai commented 1 month ago

Thanks @jubgjf for the suggestion. I have already tried multiple versions of transformers and am still getting this error.

jubgjf commented 1 month ago

I created a new environment and rewrote requirements.txt:

h5py==3.8.0
transformers==4.41.1
accelerate==0.30.1
wandb==0.15.4
optax==0.1.5
flax==0.8.0
maturin==1.3.0
pandas==2.0.3
pyahocorasick==2.0.0
matplotlib==3.7.2
scikit-learn==1.4.2
datasets==2.19.1
scipy==1.10.1
jax[cuda12]==0.4.23
jaxlib[cuda12]==0.4.23

$ pip install -r requirements.txt

and install cuda toolkit with conda:

$ conda install nvidia/label/cuda-12.2.0::cuda

These steps work for me.
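
As a sanity check after this setup, a minimal sketch to confirm that JAX actually sees the GPU (version numbers reflect the pins above):

import jax

# Expect 0.4.23 and at least one CUDA device; a CPU-only device list means
# the cuda toolkit / jaxlib install didn't take.
print(jax.__version__)
print(jax.devices())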

bminixhofer commented 1 month ago

@zaidalyafeai does this solution work for you?

If it does, I'll update the requirements.txt to match the one from @jubgjf (thanks!)

zaidalyafeai commented 1 month ago

I tried it @bminixhofer but it doesn't work. I am not using conda, so this might be the issue.

jubgjf commented 1 month ago

This environment may still be incorrect, as I ran into issues when training a hypernetwork, mentioned in https://github.com/bminixhofer/zett/issues/6#issue-2335398668

pjlintw commented 1 month ago

Upgrading to transformers==4.41.1 worked for me: starting from a certain version, the LLaMA architecture is included in the Flax causal LM classes.

Here is the list of supported models.
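
To inspect that list directly, a small sketch (requires flax to be installed; FLAX_MODEL_FOR_CAUSAL_LM_MAPPING is a transformers export):

from transformers import FLAX_MODEL_FOR_CAUSAL_LM_MAPPING

# Prints the model types the installed version supports for
# FlaxAutoModelForCausalLM; "llama" should appear on >= 4.41.1.
print(sorted(cfg.model_type for cfg in FLAX_MODEL_FOR_CAUSAL_LM_MAPPING.keys()))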

pjlintw commented 1 month ago

I encountered an EnvironmentError while executing the Mistral transfer command above. The error seemed to be due to accessing the files without the proper permission:

  File "/nethome/pinjie/anaconda3/envs/zett/lib/python3.11/site-packages/transformers/utils/hub.py", line 442, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like mistralai/Mistral-7B-v0.1 is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.

The error was resolved after I accepted the conditions for accessing its files and content on the model hub (https://huggingface.co/mistralai/Mistral-7B-v0.1).

Then I created an access token ($HUGGINGFACE_TOKEN) with the required permissions. Remember to add the repository to your access token settings. After logging in, it works now:

huggingface-cli login --token $HUGGINGFACE_TOKEN
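
A Python alternative to the CLI login (a sketch assuming huggingface_hub is installed and HUGGINGFACE_TOKEN is set in the environment):

import os
from huggingface_hub import login

# Authenticates this process with the Hub; the token must have access to
# the gated mistralai/Mistral-7B-v0.1 repo.
login(token=os.environ["HUGGINGFACE_TOKEN"])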