Closed Kowsher closed 7 months ago
Did you use the recommended version of transformers? See requirements.txt.
Yes, I followed it. It works with the regular tokenizer, but I'm facing issues with the fast tokenizer.
Unfortunately, our script has only been tested for merging with the non-fast (slow) tokenizer.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.
Closing the issue, since no updates were observed. Feel free to re-open if you need any further assistance.
That's true. I ran into the same problem, and I think it's because LlamaTokenizerFast is not the same as LlamaTokenizer. The former is just a BPE tokenizer (and, by the way, is easy to load from a tokenizer.json trained with the huggingface/tokenizers library), while the latter is backed by a sentencepiece model. As of now, LlamaTokenizerFast has no attribute named 'sp_model' (https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/tokenization_llama_fast.py).
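Given that difference, the practical workaround is to make sure the merge script gets the slow, sentencepiece-backed tokenizer (in transformers, `AutoTokenizer.from_pretrained(..., use_fast=False)` or `LlamaTokenizer` directly). A minimal sketch, assuming your own loading code; `ensure_sp_model` is a hypothetical helper name, not part of any library:

```python
# Sketch: fail early with a clear message if a fast tokenizer (which lacks
# 'sp_model') is passed to merging code. 'ensure_sp_model' is a hypothetical
# helper; adapt it to however your script loads the tokenizer.

def ensure_sp_model(tokenizer):
    """Return the tokenizer if it has a sentencepiece backend, else raise."""
    if not hasattr(tokenizer, "sp_model"):
        raise TypeError(
            f"{type(tokenizer).__name__} has no 'sp_model'; reload the slow "
            "tokenizer (e.g. use_fast=False / LlamaTokenizer, not "
            "LlamaTokenizerFast) before merging."
        )
    return tokenizer

# In a merge script this would be used like (not executed here, needs model files):
# from transformers import AutoTokenizer
# llama_tokenizer = ensure_sp_model(
#     AutoTokenizer.from_pretrained(llama_model_dir, use_fast=False)
# )
```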
Facing a similar issue. I tried the recommended version, but the issue still persists. What's the fix for this?
Check before submitting issues
Type of Issue
Model conversion and merging
Base Model
LLaMA-7B
Operating System
Linux
Describe your issue in detail
When I try to merge the tokenizer, I get this error: AttributeError: 'LlamaTokenizerFast' object has no attribute 'sp_model'
llama_spm = sp_pb2_model.ModelProto()
llama_spm.ParseFromString(llama_tokenizer.sp_model.serialized_model_proto())
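For context, once the slow tokenizer is loaded, the merge step itself boils down to "append every piece from the new vocabulary that the base vocabulary does not already contain". A minimal illustration of that logic, using plain lists in place of sentencepiece ModelProto pieces so it runs without any model files:

```python
# Illustration only: vocab merging as deduplicated concatenation. Real merge
# scripts iterate over ModelProto.pieces; plain string lists stand in here.

def merge_vocabs(base_pieces, new_pieces):
    """Return base_pieces extended with the pieces it was missing, in order."""
    seen = set(base_pieces)
    merged = list(base_pieces)
    for piece in new_pieces:
        if piece not in seen:
            merged.append(piece)
            seen.add(piece)
    return merged

print(merge_vocabs(["<s>", "a", "b"], ["b", "c"]))  # → ['<s>', 'a', 'b', 'c']
```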
Dependencies (must be provided for code-related issues)
Execution logs or screenshots