huggingface / alignment-handbook

Robust recipes to align language models with human and AI preferences
https://huggingface.co/HuggingFaceH4
Apache License 2.0
4.18k stars 354 forks source link

Early Stopping Issue when used with ConstantLengthDataset #133

Open sankydesai opened 3 months ago

sankydesai commented 3 months ago

Hello I modified the code to include the Constant Length Dataset and it's early stopping at around 15% of the training. This issue doesn't occur when not used with the normal code given. Is there an issue with constant length dataset? I used it with SFTTrainer.