About the implementation of CoH - Githubissues

agi-templar / Stable-Alignment

Multi-agent Social Simulation + Efficient, Effective, and Stable alternative of RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".

https://arxiv.org/pdf/2305.16960.pdf

About the implementation of CoH #7

Open Guochry opened 1 year ago

Guochry commented 1 year ago

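Hello! I would like to ask about the implementation details of the CoH baseline in the experiments section. The loss function in the CoH paper also includes a loss term computed on a pretraining corpus. Which pretraining corpus did you use for this term when reproducing the baseline? Thank you very much for your reply!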