Hi, thanks for your work! I have a small question.
You have implemented a StepDPOTrainer class that inherits from trl's DPOTrainer, and it defines a 'tokenize_row' function. However, as far as I can see, DPOTrainer does not have a 'tokenize_row' function; it belongs to 'OnlineDPOTrainer'. If so, the base class would never call your version, so I wonder whether the StepDPOTrainer 'tokenize_row' is really used in your training.
This may not affect your results, but I am curious: what is your 'tokenize_row' function meant to do? Did you perhaps intend to inherit from OnlineDPOTrainer instead?
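For reference, this is roughly how I would check whether a method defined on a subclass actually overrides something in its parent classes. It is only a minimal sketch: 'overrides_parent_method' is a helper name I made up, and the four small classes are hypothetical stand-ins, not the real trl or StepDPOTrainer classes.

```python
def overrides_parent_method(child_cls, method_name):
    """Return True if any ancestor of child_cls also defines method_name."""
    # __mro__[0] is child_cls itself, so skip it and look only at ancestors.
    return any(method_name in vars(base) for base in child_cls.__mro__[1:])


# Hypothetical stand-ins for a parent that defines the hook and one that does not.
class ParentWithHook:
    def tokenize_row(self, feature):
        return feature


class ParentWithoutHook:
    pass


class ChildA(ParentWithHook):
    def tokenize_row(self, feature):
        return {"overridden": True}


class ChildB(ParentWithoutHook):
    def tokenize_row(self, feature):
        return {"overridden": True}


print(overrides_parent_method(ChildA, "tokenize_row"))  # True
print(overrides_parent_method(ChildB, "tokenize_row"))  # False
```

With the installed trl version, calling the same helper as overrides_parent_method(StepDPOTrainer, 'tokenize_row') should show whether the parent defines the method at all. Note that this only tells you the method exists on a parent, not that the training loop ever calls it.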
Best wishes!