Open deep-diver opened 7 months ago
@innat yes, that is the one. The model arch is based on LLaMA2, but it is basically a sort of mixture of LLaMA2 and Mistral.
@tirthasheshpatel is working on finishing up our llama2 implementation. Once it is ready, we could probably just extend our conversion script and add this as a variant of llama2?
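
For illustration only, here is a minimal sketch of what "adding a variant" could look like if presets were kept in a simple table. Everything in it is an assumption: `DecoderConfig`, `PRESETS`, `add_variant`, and the preset names are hypothetical and not taken from the actual conversion script, and the numbers are illustrative values rather than verified hyperparameters.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DecoderConfig:
    """Hypothetical hyperparameters shared by a family of decoder models."""
    num_layers: int
    hidden_dim: int
    num_query_heads: int


# Hypothetical preset table; the values are illustrative only.
PRESETS = {
    "llama2_base": DecoderConfig(num_layers=32, hidden_dim=4096, num_query_heads=32),
}


def add_variant(presets, name, base, **overrides):
    """Return a new preset table with `name` derived from `base` plus overrides."""
    if name in presets:
        raise ValueError(f"preset {name!r} already exists")
    if base not in presets:
        raise KeyError(f"unknown base preset {base!r}")
    updated = dict(presets)  # copy so the caller's table is not mutated
    updated[name] = replace(presets[base], **overrides)
    return updated


# A deeper variant reuses the base architecture and changes only the layer count.
PRESETS = add_variant(PRESETS, "deeper_variant", "llama2_base", num_layers=48)
```

The point of the sketch is only that a variant with the same architecture and more layers might need a new preset entry rather than new model code; whether the real script works this way is not something this thread confirms.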
@mattdangerw thanks! Looks like the conversion script is already ready.
Also, I'm wondering whether you accept contributions for the reverse conversion script (Keras to HF)?
@deep-diver thanks!
There is a model called SOLAR. It follows the same architecture as LLaMA2 but has more layers, which makes it an outstanding performer: on the Open LLM Leaderboard it scores better than Mistral, and on some benchmarks even better than Mixtral.
In this case, what could I contribute?