junhwi / next-gen-ai


24/02/07 #11


junhwi commented 7 months ago

DeepSeek

  1. Jan 5, 2024 DeepSeek LLM https://github.com/deepseek-ai/DeepSeek-LLM
  2. Jan 11, 2024 DeepSeekMoE https://github.com/deepseek-ai/DeepSeek-MoE (see the routing sketch after this list)
  3. Jan 26, 2024 DeepSeek-Coder https://github.com/deepseek-ai/DeepSeek-Coder
  4. Feb 6, 2024 DeepSeekMath 7B https://github.com/deepseek-ai/DeepSeek-Math
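
The DeepSeekMoE repo is listed without notes, so as a quick refresher here is a minimal sketch of generic top-k mixture-of-experts routing in plain PyTorch. This is illustrative only and is not DeepSeekMoE's actual fine-grained / shared-expert design; all shapes, sizes, and names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k mixture-of-experts layer (illustrative sketch only)."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        logits = self.router(x)                        # (tokens, n_experts)
        weights, idx = logits.topk(self.k, dim=-1)     # pick k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                     # dispatch tokens to their chosen experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

# tiny smoke test
y = TopKMoE()(torch.randn(4, 512))
print(y.shape)  # torch.Size([4, 512])
```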

LLaVA-1.6

https://llava-vl.github.io/blog/2024-01-30-llava-1-6/

Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality

Code churn -- the percentage of lines that are reverted or updated less than two weeks after being authored -- is projected to double in 2024 compared to its 2021, pre-AI baseline.


https://gitclear-public.s3.us-west-2.amazonaws.com/Coding-on-Copilot-2024-Developer-Research.pdf
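
The metric itself is simple to state. Below is a hedged sketch of how "churned within two weeks" could be computed from per-line git history; the input format, field names, and revert/update detection are my assumptions, not GitClear's actual methodology.

```python
from datetime import timedelta

TWO_WEEKS = timedelta(days=14)

def churn_rate(line_events):
    """line_events: list of dicts like
    {"authored_at": datetime, "changed_at": datetime or None}
    where changed_at is when the line was later reverted or updated (None if never).
    Returns the fraction of authored lines changed within two weeks of being written.
    """
    authored = len(line_events)
    churned = sum(
        1 for e in line_events
        if e["changed_at"] is not None and e["changed_at"] - e["authored_at"] < TWO_WEEKS
    )
    return churned / authored if authored else 0.0
```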

SSM

Repeat After Me: Transformers are Better than State Space Models at Copying

https://arxiv.org/abs/2402.01032 https://huggingface.co/papers/2402.01032#65c12b0b5bf72d1811466dc0
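
The title states the claim; the evaluation centers on a synthetic string-copying task. A minimal sketch of how such a benchmark pair could be generated and scored is below; the prompt wording, vocabulary, and lengths are assumptions, not the paper's exact setup.

```python
import random
import string

def make_copy_example(length=50, vocab=string.ascii_lowercase, seed=None):
    """Generate one (prompt, target) pair for a copy task:
    the model sees a random string and must reproduce it verbatim."""
    rng = random.Random(seed)
    s = "".join(rng.choice(vocab) for _ in range(length))
    prompt = f"Copy the following string exactly:\n{s}\nCopy:"
    return prompt, s

def exact_match(prediction, target):
    """Score 1 if the model copied the string verbatim, else 0."""
    return int(prediction.strip() == target)

prompt, target = make_copy_example(length=20, seed=0)
print(prompt)
```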

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

https://arxiv.org/abs/2402.04248

Agent

PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models

https://github.com/git-disl/PokeLLMon https://arxiv.org/pdf/2402.01118.pdf
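
A hedged sketch of the general pattern such an agent follows: format the battle state into a prompt, ask the LLM for a legal move, and apply it. The environment interface and the `query_llm` helper are hypothetical placeholders, not the PokéLLMon codebase's actual API.

```python
def query_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real chat-completion client."""
    raise NotImplementedError

def choose_action(state: dict, legal_moves: list[str]) -> str:
    """Ask the LLM to pick one legal move given the current battle state."""
    prompt = (
        "You are playing a Pokemon battle.\n"
        f"State: {state}\n"
        f"Legal moves: {', '.join(legal_moves)}\n"
        "Reply with exactly one move name."
    )
    reply = query_llm(prompt).strip()
    # Fall back to the first legal move if the model replies with something illegal.
    return reply if reply in legal_moves else legal_moves[0]

def run_battle(env):
    """Generic observe-act loop; `env` is a hypothetical battle environment."""
    state, legal_moves, done = env.reset()
    while not done:
        action = choose_action(state, legal_moves)
        state, legal_moves, done = env.step(action)
```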

hippothewild commented 7 months ago
shylee2021 commented 7 months ago

OLMo: Accelerating the Science of Language Models https://arxiv.org/abs/2402.00838

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research https://arxiv.org/abs/2402.00159

Can Large Language Models Understand Context? https://arxiv.org/abs/2402.00858

seyong92 commented 7 months ago

Our service update is in its final stretch, so I'm not sure I'll be able to participate... Even if I do join, I'll just listen in this week.