keras-team / keras-hub

Pretrained model hub for Keras 3
Apache License 2.0

Model weights contributions? #1463

Open deep-diver opened 7 months ago

deep-diver commented 7 months ago

There is a model called SOLAR. It follows the same architecture as LLaMA2 but has more layers, which makes it an outstanding performer, better than Mistral and even Mixtral in some respects (Open LLM Leaderboard).

In this case, what could be contributed?

innat commented 7 months ago

Is it this one? https://arxiv.org/abs/2312.15166

deep-diver commented 7 months ago

@innat yes, that is the one. The model arch is based on LLaMA2, but it is basically a sort of mixture of LLaMA2 and Mistral.

mattdangerw commented 6 months ago

@tirthasheshpatel is working on finishing up our llama2 implementation. Once it is ready, we could probably just extend our conversion script and add this as a variant for llama2?

deep-diver commented 6 months ago

@mattdangerw thanks! Looks like the conversion script is already ready.

Also wondering whether you accept contributions for a reverse conversion script (Keras to HF)?
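To make the reverse-conversion idea concrete, here is a rough sketch of what I have in mind. It is not the existing conversion script: the weight names are hypothetical placeholders, and it assumes the only differences between the two layouts are the names and a transpose on dense kernels. Under those assumptions, the Keras-to-HF direction is just the inverted name map plus the same transpose.

```python
# All weight names below are hypothetical placeholders, not real
# keras-hub or Hugging Face checkpoint keys.
HF_TO_KERAS = {
    "embed.weight": "token_embedding/embeddings",
    "layers.0.q_proj.weight": "layer_0/query/kernel",
}
# Assumption for this sketch: these kernels are stored transposed
# between the two layouts.
TRANSPOSED = {"layers.0.q_proj.weight"}


def transpose(matrix):
    """Transpose a 2D matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def hf_to_keras(hf_weights):
    """Rename HF-style weights to Keras-style names, transposing kernels."""
    keras_weights = {}
    for hf_name, value in hf_weights.items():
        if hf_name in TRANSPOSED:
            value = transpose(value)
        keras_weights[HF_TO_KERAS[hf_name]] = value
    return keras_weights


def keras_to_hf(keras_weights):
    """Reverse direction: invert the name map and undo the transposes."""
    keras_to_hf_names = {k: h for h, k in HF_TO_KERAS.items()}
    hf_weights = {}
    for keras_name, value in keras_weights.items():
        hf_name = keras_to_hf_names[keras_name]
        if hf_name in TRANSPOSED:
            value = transpose(value)
        hf_weights[hf_name] = value
    return hf_weights


# Toy weights as nested lists, so the sketch needs no dependencies.
original = {
    "embed.weight": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    "layers.0.q_proj.weight": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
}
round_trip = keras_to_hf(hf_to_keras(original))
assert round_trip == original
```

A round-trip check like the last two lines would probably be the main test for such a script, since it catches both a wrong name mapping and a missed transpose.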

hunkim commented 6 months ago

@deep-diver thanks!