AkihikoWatanabe / paper_notes

Paper notes, added to occasionally
https://AkihikoWatanabe.github.io/paper_notes

Secrets of RLHF in Large Language Models Part I: PPO, Rui Zheng+, N/A, arXiv'23 #807

Open AkihikoWatanabe opened 1 year ago

AkihikoWatanabe commented 1 year ago

URL

AkihikoWatanabe commented 1 year ago

A report investigating the inner workings of RLHF and PPO. A must-read if you are interested in RLHF.

AkihikoWatanabe commented 1 year ago

github: https://github.com/OpenLMLab/MOSS-RLHF