Open hemangjoshi37a opened 1 year ago
Have you checked active learning?
@makseq Active learning is good, but RLHF is quite different because it uses reinforcement learning to optimize the model. If you are familiar with RLHF, you will know it is not the same thing as active learning.
Yes, I know, but I would like to see the workflow you have in mind in Label Studio (LS) to achieve it. It seems you need Accept/Reject actions for your annotations, or ranking?
Yes, RLHF can be done in multiple ways. The feedback can be a yes/no (accept/reject) judgment or a ranking of several outputs.
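To make the two feedback types concrete, here is a minimal sketch of how either one could be reduced to the (chosen, rejected) pairs that a reward model is commonly trained on. The function names and data shapes are hypothetical and are not part of Label Studio or any RLHF library; the loss is the standard pairwise logistic (Bradley-Terry style) loss, used here only as an illustration.

```python
import math
from itertools import combinations

def pairs_from_ranking(ranked):
    """Turn a best-first ranking of responses into (chosen, rejected) pairs.

    Hypothetical helper: every response is preferred over each one ranked below it.
    """
    return [(ranked[i], ranked[j]) for i, j in combinations(range(len(ranked)), 2)]

def pairs_from_accept_reject(labels):
    """Turn yes/no labels ({response: bool}) into (chosen, rejected) pairs.

    Hypothetical helper: each accepted response is preferred over each rejected one.
    """
    accepted = [resp for resp, ok in labels.items() if ok]
    rejected = [resp for resp, ok in labels.items() if not ok]
    return [(a, r) for a in accepted for r in rejected]

def pairwise_loss(score_chosen, score_rejected):
    """Pairwise logistic loss: -log(sigmoid(score_chosen - score_rejected))."""
    return math.log1p(math.exp(-(score_chosen - score_rejected)))

# Both feedback types end up in the same pair format.
ranking_pairs = pairs_from_ranking(["answer A", "answer B", "answer C"])
binary_pairs = pairs_from_accept_reject(
    {"answer A": True, "answer B": False, "answer C": False}
)
```

The loss is small when the reward model scores the chosen response above the rejected one and grows when the order is reversed, so both annotation styles can feed the same training objective.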
Basically, what I propose is to have a generalized RLHF model that sits on the output side of any model, so that instead of supervised training we can have unsupervised training that is supervised by the reinforcement model.
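As a rough illustration of that idea, the sketch below treats the base model as a softmax policy over a fixed set of candidate outputs and updates it using only scores from a reward function, with no labeled targets. Everything here is an assumption for illustration: `toy_reward_model` is a hypothetical stand-in for a reward model trained on human feedback, and the update is an exact policy-gradient step on a toy problem, not a full RLHF pipeline such as PPO.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_reward_model(text):
    """Hypothetical stand-in for a reward model learned from human feedback."""
    return 1.0 if "please" in text else 0.0

def train_policy(candidates, reward_fn, steps=200, lr=0.5):
    """Raise the probability of high-reward outputs using only reward scores.

    Uses the exact gradient of the expected reward J = sum_i p_i * r_i with
    respect to the logits: dJ/dlogit_k = p_k * (r_k - J).
    """
    logits = [0.0] * len(candidates)
    rewards = [reward_fn(c) for c in candidates]
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        logits = [
            logit + lr * p * (r - expected)
            for logit, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)

candidates = ["do it", "please do it", "no"]
final_probs = train_policy(candidates, toy_reward_model)
best = candidates[final_probs.index(max(final_probs))]
```

The point of the sketch is that the policy never sees a "correct" output; the only training signal is the reward model's score, which is where the human feedback enters.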
Maybe this repo will be helpful for you: https://github.com/heartexlabs/label-studio-RLHF/
@makseq Maybe it is a private repo. It is giving me a 404 error.
@hemangjoshi37a Sorry, could you please check this one? https://github.com/heartexlabs/RLHF
If anyone has a lead on this, please let me know. Also, if anyone wants to collaborate in this direction, please let me know.