PKU-Alignment / align-anything

Align Anything: Training All-modality Model with Feedback
https://align-anything.readthedocs.io
Apache License 2.0

[Feature Request] Need to support auto-regressive VLMs #13

Closed (htlou closed this issue 3 months ago)

htlou commented 4 months ago

Motivation

Current VLM support focuses mainly on encoder-decoder style models, which encode and decode multimodal information as a hidden-state tensor. We also need to support auto-regressive VLMs such as Chameleon and Anole, which encode and decode multimodal information as tokens.
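
To make the distinction concrete, here is a toy sketch of the two input styles. It is not code from align-anything, Chameleon, or Anole: the vocabulary size, the 1-D codebook, and every function name below are hypothetical stand-ins, and a real model would use a learned image tokenizer and learned embeddings instead.

```python
from typing import List, Sequence

# Hypothetical sizes and codebook, chosen only for illustration.
TEXT_VOCAB_SIZE = 100                    # text token ids occupy 0..99
CODEBOOK = [0.0, 0.25, 0.5, 0.75, 1.0]   # toy 1-D "image codebook"


def quantize_patch(value: float) -> int:
    """Return the index of the codebook entry nearest to a patch value."""
    return min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - value))


def image_to_tokens(patches: Sequence[float]) -> List[int]:
    """Token style: map each patch to an id placed after the text ids."""
    return [TEXT_VOCAB_SIZE + quantize_patch(p) for p in patches]


def tokens_to_image(token_ids: Sequence[int]) -> List[float]:
    """Decode image token ids back to codebook values (lossy)."""
    return [CODEBOOK[t - TEXT_VOCAB_SIZE] for t in token_ids if t >= TEXT_VOCAB_SIZE]


def build_token_sequence(text_ids: Sequence[int], patches: Sequence[float]) -> List[int]:
    """Auto-regressive style: text and image share one sequence of token ids."""
    return list(text_ids) + image_to_tokens(patches)


def build_hidden_sequence(
    text_ids: Sequence[int], patches: Sequence[float], dim: int = 4
) -> List[List[float]]:
    """Hidden-state style: every position is a continuous vector.

    Both the text embedding lookup and the vision encoder are replaced by
    trivial stand-ins that repeat a single number `dim` times.
    """
    text_vecs = [[float(t)] * dim for t in text_ids]
    image_vecs = [[float(p)] * dim for p in patches]
    return text_vecs + image_vecs


if __name__ == "__main__":
    print(build_token_sequence([5, 7], [0.1, 0.9]))
    print(len(build_hidden_sequence([5, 7], [0.1, 0.9])))
```

In the token style the model can both read and emit image positions with the same next-token objective it uses for text, which is why these models need a different data path from the hidden-state style.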

Solution

No response

Alternatives

No response

Additional context

No response

htlou commented 4 months ago

Work on this is now in progress in https://github.com/PKU-Alignment/align-anything/pull/36

htlou commented 3 months ago

https://github.com/PKU-Alignment/align-anything/pull/36 is now merged, so SFT support is done.