Closed Symbolk closed 1 year ago
I learned from the paper that "To evaluate data quality, we train a reward model based on OPT 1.3B (Iyer et al., 2022) to rate different responses." Can it be used as a replacement for GPT-4 for the reward task? Is it open-sourced?
Thanks for your interest. For now, we cannot release any additional resources, but we definitely will release the model. Stay tuned.