fe1ixxu / CPO_SIMPO

This repository combines the CPO and SimPO methods for better reference-free preference learning.
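
To make "combining" concrete, here is a minimal sketch of one way the two objectives can be joined for a single preference pair: SimPO's length-normalized preference term with a target margin, plus a CPO-style negative log-likelihood regularizer on the chosen response. It is an illustration under stated assumptions, not this repository's code. The function name `cpo_simpo_loss`, its parameter names, the default values, and the choice to average the NLL term over tokens are all hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cpo_simpo_loss(
    chosen_logp: float,    # summed token log-probs of the preferred response
    rejected_logp: float,  # summed token log-probs of the dispreferred response
    chosen_len: int,       # number of tokens in the preferred response
    rejected_len: int,     # number of tokens in the dispreferred response
    beta: float = 2.0,     # reward scale (illustrative default)
    gamma: float = 1.0,    # SimPO target margin (illustrative default)
    alpha: float = 1.0,    # weight of the CPO-style NLL term (illustrative default)
) -> float:
    # SimPO part: length-normalized log-probabilities stand in for rewards,
    # so no reference model is needed.
    chosen_avg = chosen_logp / chosen_len
    rejected_avg = rejected_logp / rejected_len
    preference = -log_sigmoid(beta * (chosen_avg - rejected_avg) - gamma)

    # CPO part: NLL regularizer on the preferred response.
    # Assumption: token-averaged here; a summed NLL is another possible choice.
    nll = -chosen_avg

    return preference + alpha * nll


if __name__ == "__main__":
    # Hypothetical pair: the preferred response has the higher average log-prob.
    print(cpo_simpo_loss(-10.0, -30.0, 10, 10))
```

Setting `alpha=0` in this sketch leaves only the SimPO preference term, which shows that the regularizer is the piece contributed by CPO.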