junhwi / next-gen-ai

24/01/03 #6

Open junhwi opened 6 months ago

junhwi commented 6 months ago

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling https://arxiv.org/abs/2312.15166

Fast Inference of Mixture-of-Experts Language Models with Offloading https://paperswithcode.com/paper/fast-inference-of-mixture-of-experts-language

https://openai.com/research/weak-to-strong-generalization

shylee2021 commented 6 months ago

SSM https://youtu.be/dKJEpOtVgXc

SOLAR https://arxiv.org/abs/2312.15166 https://chat.lmsys.org/ (see the leaderboard) https://twitter.com/LChoshen/status/1739993589969564027

Model Merge https://github.com/cg123/mergekit https://arxiv.org/abs/2306.01708

hippothewild commented 6 months ago