Open TrevorAshby opened 9 months ago
Fine-tune a model before performing RLHF, then perform RLHF on the fine-tuned model.
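To make the requested ordering concrete, here is a minimal toy sketch of the two stages in plain Python. It is not this repository's code: the "model" is just a list of logits over three actions, and `supervised_finetune`, `rlhf`, `reward_fn`, and `kl_coef` are hypothetical names chosen for illustration. Stage 2 uses a simple REINFORCE update with a log-ratio penalty toward the fine-tuned policy as a stand-in for whatever RLHF algorithm (for example PPO) would actually be used.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_finetune(logits, demonstrations, lr=0.1, epochs=200):
    """Stage 1 (toy): fit the policy to demonstrated actions with cross-entropy."""
    logits = list(logits)
    for _ in range(epochs):
        for action in demonstrations:
            probs = softmax(logits)
            for i in range(len(logits)):
                # Gradient of -log p(action) w.r.t. logit i is p_i - onehot_i.
                grad = probs[i] - (1.0 if i == action else 0.0)
                logits[i] -= lr * grad
    return logits


def rlhf(sft_logits, reward_fn, kl_coef=0.1, lr=0.1, steps=2000, seed=0):
    """Stage 2 (toy): REINFORCE starting from the fine-tuned policy.

    The fine-tuned policy is kept as a frozen reference, and each sampled
    reward is reduced by kl_coef * log(pi / pi_ref) to discourage drift.
    """
    rng = random.Random(seed)
    ref_probs = softmax(sft_logits)  # frozen reference from stage 1
    logits = list(sft_logits)        # the RL policy starts at the SFT weights
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        shaped = reward_fn(action) - kl_coef * math.log(
            probs[action] / ref_probs[action]
        )
        for i in range(len(logits)):
            # Gradient of log p(action) w.r.t. logit i is onehot_i - p_i.
            grad_logp = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * shaped * grad_logp
    return logits


# Hypothetical data: demonstrations prefer action 0, the reward prefers action 1.
base_logits = [0.0, 0.0, 0.0]
sft_logits = supervised_finetune(base_logits, demonstrations=[0, 0, 1])
rl_logits = rlhf(sft_logits, reward_fn=lambda a: 1.0 if a == 1 else 0.0)

sft_probs = softmax(sft_logits)
rl_probs = softmax(rl_logits)
```

The point of the sketch is only the ordering: `rlhf` takes the output of `supervised_finetune` both as its starting weights and as the reference it is penalized against, rather than starting from the base model.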