TrevorAshby / CodeRLHF

FT-RLHF #7

Open TrevorAshby opened 9 months ago

TrevorAshby commented 9 months ago

Fine-tune a model prior to performing RLHF, then perform RLHF on the fine-tuned model.
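
A toy sketch of the two-stage order described above, not the repository's implementation. It assumes a tiny softmax policy over three candidate responses; the demonstration labels, the reward values, and the function names `supervised_fine_tune` and `rl_stage` are all hypothetical. A hand-written reward vector stands in for a reward model learned from human preferences, and an exact policy gradient with a KL penalty stands in for a sampled algorithm such as PPO.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def supervised_fine_tune(logits, demos, lr=0.5, epochs=200):
    """Stage 1: gradient descent on cross-entropy over demonstration labels."""
    logits = list(logits)
    for _ in range(epochs):
        probs = softmax(logits)
        grad = [0.0] * len(logits)
        for y in demos:
            for k in range(len(logits)):
                # d(cross-entropy)/d(logit_k) = p_k - 1[k == y], averaged over demos
                grad[k] += (probs[k] - (1.0 if k == y else 0.0)) / len(demos)
        logits = [l - lr * g for l, g in zip(logits, grad)]
    return logits

def rl_stage(sft_logits, rewards, beta=0.1, lr=0.5, steps=200):
    """Stage 2: gradient ascent on expected reward minus beta * KL(policy || SFT policy).

    Starts from the fine-tuned logits, so the RL stage refines the SFT model.
    """
    ref = softmax(sft_logits)   # frozen reference policy from stage 1
    logits = list(sft_logits)   # policy being optimised
    for _ in range(steps):
        probs = softmax(logits)
        # KL-shaped reward per response: r_k - beta * log(p_k / ref_k)
        shaped = [r - beta * (math.log(p) - math.log(q))
                  for r, p, q in zip(rewards, probs, ref)]
        baseline = sum(p * s for p, s in zip(probs, shaped))
        # exact gradient of the objective w.r.t. logit_k: p_k * (shaped_k - baseline)
        logits = [l + lr * p * (s - baseline)
                  for l, p, s in zip(logits, probs, shaped)]
    return logits

base_logits = [0.0, 0.0, 0.0]   # hypothetical "pretrained" starting point
demos = [0, 0, 1, 0]            # hypothetical demonstrations (response indices)
rewards = [0.2, 1.0, 0.0]       # hypothetical stand-in for a learned reward model

sft_logits = supervised_fine_tune(base_logits, demos)
rl_logits = rl_stage(sft_logits, rewards)

print("after fine-tuning:", [round(p, 3) for p in softmax(sft_logits)])
print("after RL stage:   ", [round(p, 3) for p in softmax(rl_logits)])
```

The fine-tuned policy serves both as the starting point and as the KL reference in the second stage, which is what keeps the RL stage anchored to the fine-tuned behaviour.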