Required prerequisites
Motivation
Chameleon is a family of early-fusion, token-based mixed-modal models that can understand and generate images and text in arbitrary sequences. We believe that supporting this model would further improve the user experience of Align-Anything.
Solution
No response
Alternatives
No response
Additional context
No response