PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback
Apache License 2.0

[Feature Request] Need to support auto-regressive VLMs #13

Closed htlou closed 1 month ago

htlou commented 1 month ago

Motivation

Current VLM support focuses mainly on encoder-decoder style models, which encode and decode multimodal information as a hidden-state tensor. We also need to support auto-regressive VLMs such as Chameleon and Anole, which encode and decode multimodal information as tokens.
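To make the "multimodal information as tokens" point concrete, here is a minimal sketch of how image content could share one token sequence with text. It is only an illustration: the vocabulary sizes, the `BOI`/`EOI` marker ids, and the helper names are all hypothetical and do not reflect the actual vocabulary layout of Chameleon, Anole, or align-anything.

```python
from typing import List, Tuple

# All constants are hypothetical and chosen only for illustration.
TEXT_VOCAB_SIZE = 1000     # ids 0..999 are ordinary text tokens
IMAGE_CODEBOOK_SIZE = 64   # number of discrete codes from an image tokenizer
BOI = TEXT_VOCAB_SIZE + IMAGE_CODEBOOK_SIZE  # begin-of-image marker id
EOI = BOI + 1                                # end-of-image marker id


def image_codes_to_token_ids(codes: List[int]) -> List[int]:
    """Shift discrete image codes into the id range that follows the text vocabulary."""
    for code in codes:
        if not 0 <= code < IMAGE_CODEBOOK_SIZE:
            raise ValueError(f"image code out of range: {code}")
    return [TEXT_VOCAB_SIZE + code for code in codes]


def build_sequence(text_ids: List[int], image_codes: List[int]) -> List[int]:
    """Concatenate text tokens and image tokens into one flat sequence."""
    return text_ids + [BOI] + image_codes_to_token_ids(image_codes) + [EOI]


def split_sequence(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Recover text ids and image codes from a flat sequence."""
    text_ids: List[int] = []
    image_codes: List[int] = []
    for token in token_ids:
        if token in (BOI, EOI):
            continue  # markers carry no content of their own
        if TEXT_VOCAB_SIZE <= token < BOI:
            image_codes.append(token - TEXT_VOCAB_SIZE)
        else:
            text_ids.append(token)
    return text_ids, image_codes


sequence = build_sequence([5, 17], [0, 63])
print(sequence)
print(split_sequence(sequence))
```

Because images end up as ordinary token ids, the same next-token objective used for text can in principle cover both modalities, which is why such models need a different data path from models that pass image features as hidden-state tensors.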

Solution

No response

Alternatives

No response

Additional context

No response

htlou commented 1 month ago

Now working on this in https://github.com/PKU-Alignment/align-anything/pull/36

htlou commented 1 month ago

https://github.com/PKU-Alignment/align-anything/pull/36 is now merged, so SFT support is done.