junhwi / next-gen-ai


24/03/31 #18


junhwi commented 6 months ago

AI21: Announcing Jamba https://www.ai21.com/blog/announcing-jamba

Qwen: Qwen MoE http://qwenlm.github.io/blog/qwen-moe/

xAI: Grok-1.5 https://x.ai/blog/grok-1.5

OpenAI: Navigating the Challenges and Opportunities of Synthetic Voices https://openai.com/blog/navigating-the-challenges-and-opportunities-of-synthetic-voices

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression https://arxiv.org/abs/2403.12968
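
A minimal usage sketch of the companion `llmlingua` package, following the project README (the checkpoint name and argument names are taken from that README and may change across versions):

```python
# pip install llmlingua
from llmlingua import PromptCompressor

# LLMLingua-2 trains a small token classifier via data distillation,
# so compression is task-agnostic and runs without the target LLM.
compressor = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True,
)

long_prompt = "Meeting transcript: ..."  # any long context to compress
result = compressor.compress_prompt(
    long_prompt,
    rate=0.33,                 # keep roughly a third of the tokens
    force_tokens=["\n", "?"],  # tokens that must survive compression
)
print(result["compressed_prompt"])
```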

AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System https://arxiv.org/abs/2402.15538

seyong92 commented 6 months ago

Street Fighter 3 with LLMs https://github.com/OpenGenerativeAI/llm-colosseum

Supertone Shift https://product.supertone.ai/shift

NaturalSpeech 3 https://arxiv.org/abs/2403.03100

shylee2021 commented 6 months ago

Databricks releases DBRX, the new SOTA open LLM (they insist) https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm https://github.com/databricks/dbrx https://huggingface.co/databricks/dbrx-base https://twitter.com/danielhanchen/status/1772981050530316467

Is Fine-Tuning Still Valuable? https://hamel.dev/blog/posts/fine_tuning_valuable.html

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning https://arxiv.org/abs/2403.17919
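
A minimal sketch of LISA's core idea (my paraphrase, not the authors' code), assuming a HuggingFace-style decoder layout with `model.model.layers`, `model.model.embed_tokens`, and `model.lm_head`:

```python
import random

def lisa_resample(layers, always_on, n_active=2):
    """Freeze all transformer layers, then unfreeze a fresh random subset.

    LISA keeps the embeddings and LM head trainable throughout and
    resamples the active layers every K optimizer steps.
    """
    for layer in layers:
        for p in layer.parameters():
            p.requires_grad = False
    for layer in random.sample(list(layers), n_active):
        for p in layer.parameters():
            p.requires_grad = True
    for module in always_on:
        for p in module.parameters():
            p.requires_grad = True

# in the training loop, e.g. every K steps:
# if step % K == 0:
#     lisa_resample(model.model.layers,
#                   [model.model.embed_tokens, model.lm_head])
```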

The Unreasonable Ineffectiveness of the Deeper Layers https://arxiv.org/abs/2403.17887
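
The paper's pruning heuristic, as a sketch: measure the angular distance between the hidden states entering and leaving each block of n consecutive layers, and drop the block where that distance is smallest (the helper below is hypothetical):

```python
import torch
import torch.nn.functional as F

def best_block_to_prune(hidden_states, n):
    """hidden_states: per-layer activations, a list of [tokens, dim]
    tensors with num_layers + 1 entries (input to each layer plus the
    final output)."""
    best_start, best_dist = None, float("inf")
    for start in range(len(hidden_states) - n):
        h_in, h_out = hidden_states[start], hidden_states[start + n]
        # angular distance d = arccos(cos_sim) / pi, averaged over tokens
        cos = F.cosine_similarity(h_in, h_out, dim=-1).mean()
        dist = (torch.arccos(cos.clamp(-1.0, 1.0)) / torch.pi).item()
        if dist < best_dist:
            best_start, best_dist = start, dist
    return best_start  # drop layers [best_start, best_start + n)
```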

howon52 commented 6 months ago

Claude 3 surpasses GPT-4 on Chatbot Arena https://arstechnica.com/information-technology/2024/03/the-king-is-dead-claude-3-surpasses-gpt-4-on-chatbot-arena-for-the-first-time/

DoRA: a new, better, and faster LoRA? https://x.com/_philschmid/status/1773991507306963269?s=20 https://arxiv.org/abs/2402.09353
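
A sketch of a DoRA-style linear layer as I read the paper (not the official implementation; bias is omitted, and the norm is taken row-wise over the `(out, in)` weight, one common convention):

```python
import torch
import torch.nn as nn

class DoRALinear(nn.Module):
    """DoRA: decompose the pretrained weight into magnitude and direction,
    and let LoRA update only the direction."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.weight = base.weight   # frozen pretrained W0, shape (out, in)
        self.weight.requires_grad = False
        out_f, in_f = base.weight.shape
        # learnable magnitude, initialized to the norms of W0 so the
        # layer reproduces the base model before training
        self.magnitude = nn.Parameter(self.weight.norm(dim=1).detach())
        # LoRA factors for the directional update; B starts at zero
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, rank))

    def forward(self, x):
        direction = self.weight + self.B @ self.A          # W0 + BA
        direction = direction / direction.norm(dim=1, keepdim=True)
        return x @ (self.magnitude.unsqueeze(1) * direction).T
```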