EsYoon7 / RLHF-TLCR

[ACL'24 Findings] Official code for "TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback"
Problem about "GPT GENERATED MODIFICATED TRAIN/EVAL DATA" #1

Open Zeyuan-Liu opened 1 month ago

Zeyuan-Liu commented 1 month ago

During step 2 (reward model fine-tuning), "GPT GENERATED MODIFICATED TRAIN/EVAL DATA" is required as input. Could you please open-source your modified data, or describe the structure of the data file? Thanks a lot for your help!

EsYoon7 commented 3 weeks ago

Sorry for the late reply. I can share the JSON file that I generated. Could you leave your Gmail address?

Zeyuan-Liu commented 3 weeks ago

Thanks for your reply! My gmail address is zeyuan.liu01@gmail.com.