Open hanxiaotian opened 7 months ago
Hello @hanxiaotian, yes, there is a small bug in TRL's SFTTrainer
with how the training steps are counted; it is being fixed here: https://github.com/huggingface/trl/pull/979
Another quick question: after concatenating tokens from different samples separated by the "eos" token, is the loss calculated over the whole sequence without any mask? Is my understanding correct? Thanks!
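For context, here is a minimal sketch of what packing with an eos separator looks like. This is a hypothetical illustration, not the actual TRL implementation: the `EOS` id, `pack` function, and sample token lists are all made up. The point it shows is that a fixed-length chunk can mix tokens from different samples, and without a mask the language-modeling loss would cover every position in the chunk.

```python
EOS = 0  # hypothetical eos token id, for illustration only

def pack(samples, seq_length):
    """Concatenate tokenized samples, separated by EOS, into fixed-length chunks."""
    buffer = []
    for sample in samples:
        buffer.extend(sample + [EOS])
    # Yield fixed-length chunks; a trailing remainder shorter
    # than seq_length is dropped.
    return [buffer[i:i + seq_length]
            for i in range(0, len(buffer) - seq_length + 1, seq_length)]

chunks = pack([[1, 2, 3], [4, 5], [6, 7, 8, 9]], seq_length=4)
# Every chunk is exactly 4 tokens long; the second chunk mixes tokens
# from two different samples, and the loss is taken over all positions.
print(chunks)  # [[1, 2, 3, 0], [4, 5, 0, 6], [7, 8, 9, 0]]
```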
So the fix is merged, but there is no release yet; once there is one, the requirements should be updated to the new version of TRL.
The current training uses ConstantLengthDataset. This dataset returns a fixed length of tokens (2048) at every step; however, the total number of steps is calculated based on the number of samples. I checked some samples and found that quite a few of them are much longer than 2048 tokens (~7000), which means that some samples are never fully seen in one epoch of training.
Could you please verify if my understanding is correct?
Thanks, I appreciate it.
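To make the concern concrete, here is a hedged sketch of the mismatch being described. Both helper functions (`steps_from_samples`, `steps_from_tokens`) are hypothetical names introduced for illustration; they are not TRL APIs. If the step count is derived from the number of raw samples while the dataset actually yields fixed-length packed chunks, the two counts diverge whenever samples are much longer than the sequence length:

```python
def steps_from_samples(samples, batch_size=1):
    """Hypothetical step count derived from the raw sample count."""
    return len(samples) // batch_size

def steps_from_tokens(samples, seq_length):
    """Hypothetical step count derived from the packed token stream."""
    total_tokens = sum(len(s) + 1 for s in samples)  # +1 for an eos separator
    return total_tokens // seq_length

# One ~7000-token sample, one short sample, one exactly at the limit.
samples = [[0] * 7000, [0] * 500, [0] * 2048]
print(steps_from_samples(samples))       # 3 steps if counted per sample
print(steps_from_tokens(samples, 2048))  # 4 fixed-length chunks of tokens
```

With a per-sample count, the loop would stop after 3 steps even though the packed stream holds 4 full chunks, so part of the long sample's tokens would not be consumed in that epoch.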