princeton-nlp / SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward
MIT License

Could you add licenses to the preference datasets after reward model labeling on huggingface? #46

Closed: hanyang1999 closed this issue 1 month ago

hanyang1999 commented 1 month ago

Hi authors, thank you for the nice work and for releasing the datasets. Could you add licenses to the released datasets, namely https://huggingface.co/datasets/princeton-nlp/llama3-ultrafeedback-armorm and https://huggingface.co/datasets/princeton-nlp/llama3-ultrafeedback? They currently have no license, and adding one (e.g., MIT) would let us use them officially.

hanyang1999 commented 1 month ago

By the way, it would also be greatly appreciated if you could release more details on the random seeds used when creating the datasets, for reproducibility. Thanks!

yumeng5 commented 1 month ago

Thanks! We have added the MIT license. For dataset creation, we used random seeds 13, 21, 42, 79, and 100 with a sampling temperature of 0.8 to obtain 5 responses per prompt. I hope this helps!

Best, Yu
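
The sampling setup described above can be sketched as follows. The seeds and temperature come directly from the reply; `generate_fn` is a hypothetical stand-in for whatever decoding call the authors actually used (not their released code):

```python
# Sketch of the dataset sampling setup: 5 responses per prompt,
# one per seed, at temperature 0.8 (values from the maintainers' reply).
SEEDS = [13, 21, 42, 79, 100]
TEMPERATURE = 0.8

def sample_responses(prompt, generate_fn):
    """Collect one response per seed for a prompt.

    generate_fn is assumed to accept (prompt, seed=..., temperature=...)
    and return a single decoded string; swap in your own decoding call.
    """
    responses = []
    for seed in SEEDS:
        responses.append(generate_fn(prompt, seed=seed, temperature=TEMPERATURE))
    return responses
```

In practice each seed would reseed the sampler before decoding, so the 5 responses per prompt are reproducible given the same model and decoding parameters.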

hanyang1999 commented 1 month ago

Thank you for the prompt replies! Closing the issue.