Closed XuyaoWang closed 3 weeks ago
Required prerequisites
Motivation
As fields such as V2T (video-to-text) and I2T (image-to-text) mature, we need an any-to-text model that can be trained with SFT and XPO for alignment.
Solution
See our PR
Alternatives
No response
Additional context
No response