hijkzzz/Awesome-LLM-Strawberry
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Apache License 2.0
📝 Added two 2023 papers and one 2024 paper
#8
Closed
esbenkc closed 1 month ago
esbenkc commented 1 month ago
2024
ARES - Alternating Reinforcement Learning and Supervised Fine-Tuning for Enhanced Multi-Modal Chain-of-Thought Reasoning Through Diverse AI Feedback
Ju-Seung Byun, Jiyun Chun, Jihyung Kil, Andrew Perrault
KTO - Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
2023
Training Chain-of-Thought via Latent-Variable Inference
Du Phan, Matthew D. Hoffman, David Dohan, Sholto Douglas, Tuan Anh Le, Aaron Parisi, Pavel Sountsov, Charles Sutton, Sharad Vikram, Rif A. Saurous
hijkzzz commented 1 month ago
merged