junhwi / next-gen-ai

24/05/26 #26

Open junhwi opened 1 month ago

junhwi commented 1 month ago

Layer-Condensed KV Cache for Efficient Inference of Large Language Models

https://arxiv.org/abs/2405.10637

Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet

https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

https://arxiv.org/abs/2405.12130

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

https://arxiv.org/abs/2405.14333

shylee2021 commented 1 month ago

Cohere Aya 23 https://cohere.com/blog/aya23

Reducing Transformer Key-Value Cache Size with Cross-Layer Attention https://arxiv.org/abs/2405.12981

seyong92 commented 1 month ago

Scarlett Johansson told OpenAI not to use her voice — and it did anyway https://x.com/verge/status/1792689427585646859

NumPy 2.0 https://numpy.org/devdocs/release/2.0.0-notes.html

Images that Sound: Composing Images and Sounds on a Single Canvas https://arxiv.org/abs/2405.12221

Whisper large model with songs