huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Fix llama gguf converter #31575

Closed SunMarc closed 2 days ago

SunMarc commented 4 days ago

What does this PR do?

This PR fixes the failing GGUF tests caused by an earlier PR. Instead of reusing the converter methods of the SpmConverter class in GGUFLlamaConverter, we copy them and remove the parts that use the pieces and trainer_spec.control_symbols attributes of self.proto, which come from sentencepiece (https://github.com/google/sentencepiece/blob/6225e08edb2577757163b3f5dbba4c0b670ef445/src/sentencepiece_model.proto#L299C29-L299C33). GGUFTokenizerSkeleton doesn't have these attributes, and I don't think we should add them.
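To illustrate the idea, here is a minimal, self-contained sketch of the pattern described above. It does not use the transformers code: BaseConverter, SkeletonConverter, and the tokens / scores fields on the skeleton object are hypothetical stand-ins chosen for this example, and the real SpmConverter and GGUFLlamaConverter methods may look quite different.

```python
from types import SimpleNamespace


class BaseConverter:
    """Hypothetical stand-in for a converter backed by a sentencepiece proto."""

    def __init__(self, proto):
        self.proto = proto

    def vocab(self):
        # Needs `proto.pieces`, which only a sentencepiece-style proto provides.
        return [(p.piece, p.score) for p in self.proto.pieces]

    def special_tokens(self):
        # Needs `proto.trainer_spec.control_symbols`.
        return list(self.proto.trainer_spec.control_symbols)


class SkeletonConverter(BaseConverter):
    """Hypothetical GGUF-style converter.

    It re-implements the methods instead of inheriting them, so it never
    touches `pieces` or `trainer_spec.control_symbols`.
    """

    def vocab(self):
        # Assumption: the skeleton exposes parallel `tokens` and `scores` lists.
        return list(zip(self.proto.tokens, self.proto.scores))

    def special_tokens(self):
        # No control symbols are available on the skeleton, so report none.
        return []


# A skeleton-like object that lacks `pieces` and `trainer_spec`.
skeleton = SimpleNamespace(tokens=["<unk>", "hello"], scores=[0.0, -1.5])

converter = SkeletonConverter(skeleton)
vocab = converter.vocab()
specials = converter.special_tokens()

# The inherited implementation fails on the same object.
try:
    BaseConverter(skeleton).vocab()
    base_failed = False
except AttributeError:
    base_failed = True
```

The design choice shown here is to keep the skeleton object minimal and adapt the converter to it, rather than adding sentencepiece-only attributes to the skeleton just to satisfy inherited code.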

Fixes https://github.com/huggingface/transformers/issues/31553

HuggingFaceDocBuilderDev commented 4 days ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.