Closed by sparsh35 1 month ago
Hi, that's not supported at the moment, but I guess I'll add support for it in the next 24 hours.
I am also thinking of implementing an Online DPO trainer with EasyDeL, with a little bit of your support if you are interested. According to a paper from DeepMind, it can be comparable with PPO. I guess the biggest bottleneck would be generating completions during training.
Yes, that seems cool to me. If you need any help, you can DM me on Discord.
@sparsh35 SequenceClassificationTrainer is added
Describe the bug
I want to train a reward model using EasyDeL with sequence classification. The classifier is implemented in the Flax sequence classification classes for each model, but is there a way to load a model directly with a sequence classification head and train it?
To Reproduce
Steps to reproduce the behavior
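For context, here is a rough sketch of the objective a reward model trained this way usually minimizes, assuming pairwise preference data (a chosen and a rejected completion per prompt) and a classification head that outputs one scalar score per sequence. It is plain Python and does not use any EasyDeL API; the function names `log_sigmoid` and `pairwise_reward_loss` are made up for illustration, and the actual trainer may compute its loss differently.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_reward_loss(chosen_scores, rejected_scores):
    """Mean pairwise (Bradley-Terry style) loss: -log sigmoid(r_chosen - r_rejected).

    Both arguments are hypothetical lists of scalar scores, one per pair,
    as a sequence classification head with a single output might produce.
    """
    if not chosen_scores or len(chosen_scores) != len(rejected_scores):
        raise ValueError("expected two non-empty score lists of equal length")
    losses = [-log_sigmoid(c - r) for c, r in zip(chosen_scores, rejected_scores)]
    return sum(losses) / len(losses)


# Illustrative scores only: the loss shrinks as chosen scores exceed rejected ones.
well_ranked = pairwise_reward_loss([2.0, 1.5], [-1.0, 0.0])
badly_ranked = pairwise_reward_loss([-1.0, 0.0], [2.0, 1.5])
```

With these made-up scores, `well_ranked` comes out smaller than `badly_ranked`, which is the behavior the trainer would push the model toward.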