Closed by Nikita-Sherstnev 4 months ago
You'll need to change a few more configuration params (e.g. n_local_heads should be 8).
I'd copy them from the MistralConfig docs: https://huggingface.co/docs/transformers/main/model_doc/mistral#transformers.MistralConfig
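As a rough sketch of what copying those values over could look like: the numbers below are the documented MistralConfig defaults for Mistral-7B, with vocab_size set to the 32002 mentioned in this thread. Only `n_local_heads` and `vocab_size` are names taken from this thread; the other target names (`dim`, `n_layer`, `n_head`, `intermediate_size`) and both helper functions are hypothetical and would need to match whatever the repo's config class actually uses. The last lines show one plausible source of a shape mismatch: with grouped-query attention, the key/value projections have fewer rows than the query projection.

```python
# Hugging Face MistralConfig defaults (Mistral-7B), except vocab_size,
# which is the 32002 used for the OpenOrca variant in this thread.
HF_MISTRAL = {
    "vocab_size": 32002,
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
}

# ASSUMPTION: apart from vocab_size and n_local_heads, these target names
# are hypothetical placeholders for the local config's field names.
HF_TO_LOCAL = {
    "vocab_size": "vocab_size",
    "hidden_size": "dim",
    "intermediate_size": "intermediate_size",
    "num_hidden_layers": "n_layer",
    "num_attention_heads": "n_head",
    "num_key_value_heads": "n_local_heads",
}


def to_local_config(hf_config):
    """Rename Hugging Face config keys to the (assumed) local names."""
    return {HF_TO_LOCAL[key]: value for key, value in hf_config.items()}


def kv_proj_rows(cfg):
    """Output rows of the key (or value) projection for this config."""
    head_dim = cfg["dim"] // cfg["n_head"]
    return cfg["n_local_heads"] * head_dim


cfg = to_local_config(HF_MISTRAL)
print(cfg["n_local_heads"])   # 8 key/value heads
print(kv_proj_rows(cfg))      # rows expected with grouped-query attention

# If n_local_heads is left equal to n_head, the expected K/V shape differs
# from the checkpoint's, which would surface as a shape mismatch on load.
wrong = dict(cfg, n_local_heads=cfg["n_head"])
print(kv_proj_rows(wrong))
```

Comparing the two printed row counts against the shape reported in the error message should indicate whether the number of key/value heads is the parameter that is off.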
Done in #116. The issue can be closed now.
Would it be hard to adapt this code for Mistral? I tried the OpenOrca version and set vocab_size in the config to 32002, but the shapes did not match: