Closed diogo-cruz closed 1 year ago
We could use https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest. tl;dr: a RoBERTa model trained on tweets for 3 classes: negative, neutral, positive.
The reward can then be the cross-entropy with the desired class.
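A minimal sketch of that reward, assuming we already have the classifier's logits over the 3 classes for a generated text (the actual model loading via `transformers` is omitted here; the function name and class ordering are illustrative):

```python
import numpy as np

def sentiment_reward(logits, desired_class):
    """Reward = negative cross-entropy with the desired class,
    i.e. the log-probability the classifier assigns to that class
    (higher is better, max is 0)."""
    logits = np.asarray(logits, dtype=float)
    # numerically stable softmax over the 3 classes
    # (0 = negative, 1 = neutral, 2 = positive, per the model card)
    exps = np.exp(logits - logits.max())
    probs = exps / exps.sum()
    return float(np.log(probs[desired_class]))

# hypothetical classifier logits for one generated text
logits = [0.1, 0.5, 2.3]
reward = sentiment_reward(logits, desired_class=2)  # rewarding "positive"
```

Using the log-probability (rather than raw cross-entropy) keeps the sign convention right for RL: a text the classifier confidently labels with the desired class gets a reward near 0, and unlikely classes get large negative rewards.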
The only thing to check for this model is whether it fits in VRAM at the same time as GPT-2. It should not occupy much more than 1 GB.
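A rough back-of-the-envelope check for the weight memory (parameter counts are approximate; this ignores activations, gradients, and optimizer state, so real usage during a forward pass will be higher):

```python
def param_mem_gb(num_params, bytes_per_param=4):
    """Approximate memory for model weights in GB (fp32 by default)."""
    return num_params * bytes_per_param / 1024**3

# roberta-base (the sentiment model's backbone) has ~125M parameters,
# GPT-2 small has ~124M, so each is roughly 0.5 GB of fp32 weights.
roberta_gb = param_mem_gb(125_000_000)
gpt2_gb = param_mem_gb(124_000_000)
```

So holding both sets of weights at once should stay around 1 GB, consistent with the estimate above.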
Current implementation at https://github.com/allenai/RL4LMs/commit/e147dd3dee0e539cb96d97e3ffb851ef63c85fc0.
It is largely untested end to end, but the individual components work.
The current reward model was trained on Twitter data. Given the meeting outcome (agreeing to fine-tune on the same dataset as the sentiment model), I will update this to use a model trained on IMDB.
features
This consists of: