Closed lewtun closed 6 months ago
Hello @lewtun
This broke the `cpt` use case, because this line makes bold assumptions about the columns in a user's dataset.
Even in DPO/SFT settings users may have different column names than the ones given here:
Is there a better way to solve this by giving users more control? For `cpt`, I'm working on a fix so that `text_column` from the data config gets passed through and is not removed. Beyond that, I feel this is very restrictive about a user's data layout.
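To make the suggestion concrete, here is a minimal sketch of what "more control" could look like. It is not the code in this PR: the function name `columns_to_remove`, the `DEFAULT_KEEP` tuple, and its contents are hypothetical stand-ins for whatever the linked line currently hard-codes. The idea is only that the set of columns to keep is a parameter, and that a user-supplied `text_column` is always preserved.

```python
from typing import Iterable, List, Optional

# Hypothetical default schema, for illustration only. The real column
# names are whatever the line referenced above assumes.
DEFAULT_KEEP = ("messages", "chosen", "rejected", "prompt", "completion", "label")


def columns_to_remove(
    column_names: Iterable[str],
    keep: Iterable[str] = DEFAULT_KEEP,
    text_column: Optional[str] = None,
) -> List[str]:
    """Return the columns that are safe to drop, preserving input order.

    `keep` can be overridden by the user, and `text_column` (for example
    the one set in the data config for cpt) is never dropped.
    """
    keep_set = set(keep)
    if text_column is not None:
        keep_set.add(text_column)
    return [name for name in column_names if name not in keep_set]


if __name__ == "__main__":
    # A cpt-style dataset with a single free-text column plus metadata.
    print(columns_to_remove(["text", "id", "source"], text_column="text"))
```

The returned list could then be handed to the dataset's `remove_columns` call instead of a fixed list, so a custom layout only needs a different `keep` argument rather than a code change.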
Adds StarChat2 recipe, along with logic to decontaminate the datasets against HumanEval.
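For readers unfamiliar with decontamination, the general idea can be sketched as follows. This is an assumption about the approach, not the logic added in this PR: it drops any training sample whose text contains a benchmark snippet after whitespace and case normalization. The helper names (`normalize`, `is_contaminated`, `decontaminate`) and the `text_key` parameter are hypothetical.

```python
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse all whitespace runs to single spaces."""
    return " ".join(text.lower().split())


def is_contaminated(sample_text: str, benchmark_snippets: Iterable[str]) -> bool:
    """True if any non-empty benchmark snippet occurs inside the sample."""
    haystack = normalize(sample_text)
    return any(
        normalize(snippet) in haystack
        for snippet in benchmark_snippets
        if snippet.strip()  # skip empty snippets, which would match everything
    )


def decontaminate(
    samples: Iterable[Dict[str, str]],
    benchmark_snippets: List[str],
    text_key: str = "text",
) -> List[Dict[str, str]]:
    """Keep only the samples that do not contain a benchmark snippet."""
    return [
        sample
        for sample in samples
        if not is_contaminated(sample[text_key], benchmark_snippets)
    ]
```

Exact substring matching is the simplest variant and misses paraphrased or lightly edited copies; n-gram overlap is a common stricter alternative.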