opening-up-chatgpt / opening-up-chatgpt.github.io

Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators.” In Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.
https://opening-up-chatgpt.github.io/
Apache License 2.0

Add UltraLM #63

Closed: mdingemanse closed this issue 8 months ago

mdingemanse commented 8 months ago

https://huggingface.co/openbmb/UltraRM-13b

"We train and release a reward model UltraRM based on UltraFeedback to further facilitate alignment research. UltraRM is initialized by LLaMA2-13B."

Found via the AlpacaEval leaderboard. Preprint on the (synthetic) RLHF data: https://arxiv.org/abs/2310.01377