hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Apache License 2.0

Relevant papers to add #9

Closed by ArcticBeat05 1 month ago

ArcticBeat05 commented 1 month ago

https://arxiv.org/pdf/2312.02179 - Training Chain-of-Thought via Latent-Variable Inference, by Google

https://arxiv.org/pdf/2402.05808 - Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

https://arxiv.org/pdf/2401.08967 - ReFT: Reasoning with Reinforced Fine-Tuning, by ByteDance Research

hijkzzz commented 1 month ago

merged