Closed redwoodzero0 closed 6 months ago
ExLlamaV2 supports Yi models, but not the custom YiTokenizer. To the extent that the tokenization can be done by either SentencePiece or the Tokenizers library, it should still be okay, and I have had Yi models running seemingly fine (outside of TGW, at least). There are some reports of possible tokenization issues that I still have to look into.
The error you're seeing there is not a tokenizer error, though; it's likely because you're running a fairly old version of ExLlamaV2. Newer versions recognize the architecture and look for the `ln1` and `ln2` tensors that Yi models have instead of `input_layernorm` and `post_attention_layernorm`, as Llama calls them.
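The architecture handling described above amounts to remapping tensor keys. The sketch below is a hypothetical illustration of that idea, not ExLlamaV2's implementation; only the two layernorm names come from the discussion, everything else is assumed:

```python
# Yi-style layernorm names mapped to the Llama-style names a loader expects.
YI_TO_LLAMA = {
    "ln1": "input_layernorm",
    "ln2": "post_attention_layernorm",
}

def normalize_key(key: str) -> str:
    """Rewrite one component of a dotted tensor key, e.g.
    'model.layers.0.ln1.weight' -> 'model.layers.0.input_layernorm.weight'."""
    return ".".join(YI_TO_LLAMA.get(part, part) for part in key.split("."))
```

A loader that applies such a mapping while reading the checkpoint can treat a Yi model as a Llama-architecture model for everything downstream.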
Cleaning up some stale issues.
When I try to quantize and run with Exl2, it won't run due to a YiTokenizer-related error. Are there any plans for exl2 quantization and loader compatibility for models that use YiTokenizer?

![Screenshot_20231207_033130_Chrome](https://github.com/turboderp/exllamav2/assets/150784533/e0621f2a-e7a6-41e7-ac36-deb4f5e92d4b)