📝 Added two 2023 papers and one 2024 paper

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

Apache License 2.0

5.04k stars, 283 forks

📝 Added two 2023 papers and one 2024 paper #8

Closed: esbenkc closed this 1 month ago

esbenkc commented 1 month ago

2024

ARES - Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
- Ju-Seung Byun, Jiyun Chun, Jihyung Kil, Andrew Perrault
KTO - Model Alignment as Prospect Theoretic Optimization
- Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela

2023

Training Chain-of-Thought via Latent-Variable Inference
- Du Phan, Matthew D. Hoffman, David Dohan, Sholto Douglas, Tuan Anh Le, Aaron Parisi, Pavel Sountsov, Charles Sutton, Sharad Vikram, Rif A. Saurous

hijkzzz commented 1 month ago

merged