huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0

🌟 #135

Closed lewtun closed 6 months ago

lewtun commented 6 months ago

Adds StarChat2 recipe, along with logic to decontaminate the datasets against HumanEval.

HuggingFaceDocBuilderDev commented 6 months ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BramVanroy commented 6 months ago

Hello @lewtun

This broke the CPT (continued pretraining) use case, because this line makes bold assumptions about the columns in a user's dataset:

https://github.com/huggingface/alignment-handbook/blob/595023faa401b8a0ea8338aac712a6f768ee9b34/src/alignment/data.py#L176

Even in DPO/SFT settings, users may have column names that differ from the ones given here:

https://github.com/huggingface/alignment-handbook/blob/595023faa401b8a0ea8338aac712a6f768ee9b34/src/alignment/data.py#L26

Is there a better way to solve this by giving users more control? For CPT I'm working on a fix so that text_column from the data config gets passed through and is not removed, but beyond that I feel this is very restrictive with respect to a user's data layout.
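To illustrate the concern, here is a minimal, hypothetical sketch (not the handbook's actual code; the column names and function names are assumptions) contrasting hard-coded column filtering with a version that takes the column name from the user's data config:

```python
# Hypothetical sketch: why hard-coding expected column names breaks
# datasets with other layouts, and how passing the column name through
# the data config avoids it.

# Assumed default chat-style columns (illustrative, not the actual list).
DEFAULT_CHAT_COLUMNS = ["messages", "chosen", "rejected", "prompt"]

def strip_columns_fixed(example: dict) -> dict:
    """Restrictive version: keeps only the hard-coded columns."""
    return {k: v for k, v in example.items() if k in DEFAULT_CHAT_COLUMNS}

def strip_columns_configurable(example: dict, keep: list[str]) -> dict:
    """User-controlled version: the caller decides which columns survive."""
    return {k: v for k, v in example.items() if k in keep}

# A CPT dataset typically has a plain text column, not chat columns.
cpt_example = {"text": "Some pretraining document.", "source": "web"}

# The fixed version silently drops the CPT text column entirely:
assert strip_columns_fixed(cpt_example) == {}

# Passing text_column from the data config preserves it:
assert strip_columns_configurable(cpt_example, keep=["text"]) == {
    "text": "Some pretraining document."
}
```

The same pattern generalizes to DPO/SFT: letting the config name the relevant columns removes the assumption about the user's data layout.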