argilla-io / notus

Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
MIT License

Warn about data contamination of UltraFeedback and add Notus naming explanation #10

Closed alvarobartt closed 11 months ago

alvarobartt commented 11 months ago

Description

This PR warns readers about the data contamination issue in UltraFeedback. Following the recent MistralAI reports on data contamination, AllenAI also reported that some of that contamination affects the TruthfulQA benchmark within UltraFeedback. As a result, the TruthfulQA scores that Zephyr and we obtained are not correct or fair.

A short explanation has also been included so that users know why the name Notus was chosen and where it comes from.