Loading checkpoint shards: 0%| | 0/4 [00:00<?, ?it/s]
Loading checkpoint shards: 25%|██▌ | 1/4 [00:00<00:02, 1.18it/s]
Loading checkpoint shards: 50%|█████ | 2/4 [00:01<00:01, 1.23it/s]
Loading checkpoint shards: 75%|███████▌ | 3/4 [00:02<00:00, 1.20it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00, 1.52it/s]
Loading checkpoint shards: 100%|██████████| 4/4 [00:02<00:00, 1.39it/s]
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'.
The class this function is called from is 'LlamaTokenizer'.
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
Traceback (most recent call last):
  File "/scratch2/mas/jiangkailin/EasyEdit-main/examples/run_knowedit_llama2.py", line 207, in <module>
    editor = BaseEditor.from_hparams(hparams)
  File "/scratch2/mas/jiangkailin/EasyEdit-main/../EasyEdit-main/easyeditor/editors/editor.py", line 59, in from_hparams
    return cls(hparams)
  File "/scratch2/mas/jiangkailin/EasyEdit-main/../EasyEdit-main/easyeditor/editors/editor.py", line 89, in __init__
    self.tok = LlamaTokenizer.from_pretrained(self.model_name)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2089, in from_pretrained
    return cls._from_pretrained(
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2311, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 169, in __init__
    self.sp_model = self.get_spm_processor(kwargs.pop("from_slow", False))
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/transformers/models/llama/tokenization_llama.py", line 196, in get_spm_processor
    tokenizer.Load(self.vocab_file)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/sentencepiece/__init__.py", line 961, in Load
    return self.LoadFromFile(model_file)
  File "/home/jiangkailin/miniconda3/envs/EasyEdit/lib/python3.9/site-packages/sentencepiece/__init__.py", line 316, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
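The TypeError is raised because LlamaTokenizer ends up passing self.vocab_file, which is None, to sentencepiece's LoadFromFile: the checkpoint directory apparently ships only a fast tokenizer (tokenizer.json, class PreTrainedTokenizerFast, as the warning above states) and no SentencePiece tokenizer.model file. A minimal workaround sketch, assuming a hypothetical local path ./llama2-checkpoint standing in for the configured model_name, is to load the tokenizer through AutoTokenizer so the fast tokenizer is used instead of the slow SentencePiece-based one:

    from transformers import AutoTokenizer

    # Hypothetical path; substitute the model_name set in the hparams file.
    model_name = "./llama2-checkpoint"

    # AutoTokenizer resolves to the fast tokenizer defined by tokenizer.json,
    # so no SentencePiece tokenizer.model file is required.
    tok = AutoTokenizer.from_pretrained(model_name)
    print(type(tok).__name__)  # e.g. LlamaTokenizerFast / PreTrainedTokenizerFast

Alternatively, downloading the original tokenizer.model into the checkpoint directory would let LlamaTokenizer.from_pretrained succeed as written.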