zjunlp / EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
https://zjunlp.github.io/project/KnowEdit
MIT License

llama3 in run_knowedit_llama2.py #275

Closed kailinjiang closed 3 months ago

kailinjiang commented 3 months ago

Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00, 1.39it/s]
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'. The class this function is called from is 'LlamaTokenizer'.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565

Traceback (most recent call last):
  File "/scratch2/mas/jiangkailin/EasyEdit-main/examples/run_knowedit_llama2.py", line 207, in <module>
    editor = BaseEditor.from_hparams(hparams)
  File "/scratch2/mas/jiangkailin/EasyEdit-main/../EasyEdit-main/easyeditor/editors/editor.py", line 59, in from_hparams
    return cls(hparams)
  File "/scratch2/mas/jiangkailin/EasyEdit-main/../EasyEdit-main/easyeditor/editors/editor.py", line 89, in __init__
    self.tok = LlamaTokenizer.from_pretrained(self.model_name)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2089, in from_pretrained
    return cls._from_pretrained(
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2311, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 169, in __init__
    self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 196, in get_spm_processor
    tokenizer.Load(self.vocab_file)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/sentencepiece/__init__.py", line 961, in Load
    return self.LoadFromFile(model_file)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
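The crash most likely happens because editor.py hard-codes LlamaTokenizer, which needs a SentencePiece tokenizer.model file; Llama 3 checkpoints ship only a tiktoken-based fast tokenizer, so the vocab file resolves to None and sentencepiece raises "TypeError: not a string". A minimal workaround sketch, not the official EasyEdit fix, assuming a local Llama 3 checkpoint (the path below is hypothetical): load the tokenizer with AutoTokenizer, which picks the matching PreTrainedTokenizerFast class.

# Minimal sketch (hypothetical path, not the official EasyEdit fix):
# Llama 3 has no SentencePiece tokenizer.model, so LlamaTokenizer cannot load it.
# AutoTokenizer resolves the checkpoint to PreTrainedTokenizerFast instead.
from transformers import AutoTokenizer

model_name = "./hugging_cache/llama-3-8b"  # hypothetical local path to the Llama 3 weights
tok = AutoTokenizer.from_pretrained(model_name)
print(type(tok).__name__)  # expected: PreTrainedTokenizerFast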

XeeKee commented 3 months ago

Thank you very much for your interest in EasyEdit. We will investigate and fix this issue shortly.