OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License
626 stars, 49 forks

[Model Request] upstage/SOLAR-10.7B-v1.0 #45

Closed joseph777111 closed 3 months ago

joseph777111 commented 7 months ago

SOLAR-10.7B is a compact yet remarkably powerful large language model. It has demonstrated state-of-the-art performance among models under 30B parameters, rivaling models with up to 30B parameters.

SOLAR-10.7B was developed using Upstage's Depth Up-Scaling. It is built on the Llama 2 architecture, with Mistral 7B weights integrated into its upscaled layers as part of its pre-training.
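For readers unfamiliar with Depth Up-Scaling, here is a minimal sketch of the layer arithmetic only. It assumes the recipe as I understand it from Upstage's public description: duplicate the base layer stack, drop the last few layers from the first copy and the first few from the second, then concatenate. The values 32 (base layers) and 8 (layers dropped per copy) are assumptions not stated in this thread, and `depth_up_scale` is a hypothetical helper, not part of OmniQuant or any Upstage code.

```python
def depth_up_scale(num_base_layers: int, num_dropped: int) -> list[int]:
    """Return base-model layer indices in the order they would appear
    in the up-scaled stack (illustrative only, no weights involved)."""
    if not 0 <= num_dropped <= num_base_layers:
        raise ValueError("num_dropped must be between 0 and num_base_layers")
    # First copy: keep everything except the last `num_dropped` layers.
    first_copy = list(range(num_base_layers - num_dropped))
    # Second copy: keep everything except the first `num_dropped` layers.
    second_copy = list(range(num_dropped, num_base_layers))
    return first_copy + second_copy


# Assumed values: a 32-layer base with 8 layers dropped from each copy.
layers = depth_up_scale(32, 8)
print(len(layers))  # 48
```

If those assumed values are right, the result is a 48-layer stack whose middle layers appear twice, which would explain why the model is larger than a 7B base while still using a Llama-style layer layout.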

Upstage's depth-upscaled SOLAR-10.7B has remarkable performance. It outperforms models with up to 30B parameters, even surpassing the recent Mixtral 8x7B model. For detailed information, please refer to the experimental table. SOLAR-10.7B is also an ideal choice for fine-tuning, offering robustness and adaptability. Upstage's simple instruction fine-tuning of the SOLAR-10.7B pre-trained model yields significant performance improvements (SOLAR-10.7B-Instruct-v1.0).

https://huggingface.co/upstage/SOLAR-10.7B-v1.0

[Image: Screenshot 2023-12-20 at 10 04 06 AM]

joseph777111 commented 7 months ago

I wasn't sure if anything specific had to be done for this one, since it is a hybrid model that combines some of Mistral 7B's weights with Llama 2 13B. 😬