stanford-crfm / ecosystem-graphs


Add an asset #160

Open russellkim opened 6 months ago

russellkim commented 6 months ago

SOLAR - https://arxiv.org/abs/2312.15166

name: SOLAR
organization: Upstage.ai
description:

We present a methodology for scaling LLMs called depth up-scaling (DUS), which encompasses architectural modifications and continued pre-training. In other words, we integrated Mistral 7B weights into the upscaled layers and then continued pre-training the entire model.

SOLAR-10.7B has remarkable performance. It outperforms models with up to 30B parameters, even surpassing the recent Mixtral 8X7B model. For detailed information, please refer to the experimental table. SOLAR-10.7B is an ideal choice for fine-tuning, offering robustness and adaptability for your fine-tuning needs. Our simple instruction fine-tuning using the SOLAR-10.7B pre-trained model yields significant performance improvements (SOLAR-10.7B-Instruct-v1.0).

created date: 2023
url: https://arxiv.org/abs/2312.15166
model card: https://huggingface.co/upstage/SOLAR-10.7B-v1.0
modality: text
analysis:
size: 10.7B
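For readers unfamiliar with the "architectural modifications" in the description, here is a minimal sketch of the layer arithmetic behind depth up-scaling. It assumes the recipe in the linked paper: duplicate the base layer stack, drop the last m layers from the original and the first m from the copy, then concatenate. The function name `depth_up_scale` and the values n = 32 and m = 8 are illustrative assumptions, not something stated in this issue, and layers are represented by their indices rather than real weights.

```python
def depth_up_scale(layers, m):
    """Sketch of depth up-scaling on a list of layers (hypothetical helper).

    Assumed recipe: keep the first n - m layers of the original stack,
    then append the last n - m layers of a copy of the same stack.
    """
    n = len(layers)
    if not 0 <= m < n:
        raise ValueError("m must satisfy 0 <= m < n")
    # Original minus its last m layers, followed by the copy minus its first m.
    return layers[: n - m] + layers[m:]


# Assumed sizes: a 32-layer base model with m = 8 removed from each copy.
base = list(range(32))
scaled = depth_up_scale(base, m=8)
print(len(scaled))  # 48
```

Under these assumed sizes the up-scaled stack has 2 * (32 - 8) = 48 layers, with layers 8 through 23 of the base appearing twice. Continued pre-training, mentioned in the description, is a separate step and is not sketched here.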

rishibommasani commented 6 months ago

Thanks, this looks great - could you add a PR @russellkim?

russellkim commented 6 months ago

@rishibommasani Thanks, please review it: https://github.com/stanford-crfm/ecosystem-graphs/pull/167