llava-rlhf / LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF
https://llava-rlhf.github.io/
GNU General Public License v3.0
315 stars, 21 forks

Merge the models #9

Closed: ThierryDeruyttere closed this issue 1 year ago

ThierryDeruyttere commented 1 year ago

Hi,

Could you provide merged models, i.e., models with the LoRA weights merged into the base model? This would make it possible to load the model directly in 4-bit. Thanks!

Edward-Sun commented 1 year ago

Hi,

LLaVA-RLHF is trained with LoRA in the bfloat16 data type. While it is possible to merge the LoRA weights into the base model, and thus enable inference with libraries such as TGI and vLLM, we found that the merged weights can lead to degraded performance. Therefore, we recommend loading the LoRA weights directly with the PEFT-LoRA framework when evaluating our models.
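
For readers wondering how a merge could change results at all, here is a minimal pure-Python sketch of one plausible contributor: rounding the merged weight `W + scale * B @ A` back to bfloat16. This is an assumption for illustration only; the thread does not state the cause of the degradation. The helper names (`to_bf16`, `merge`, `lora_forward`) and the toy values are hypothetical and are not part of LLaVA-RLHF or PEFT.

```python
import struct

def to_bf16(x: float) -> float:
    """Round a float to the nearest bfloat16 value (round-to-nearest-even).
    NaN/inf handling is ignored in this toy helper."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    bits += 0x7FFF + ((bits >> 16) & 1)                  # rounding bias
    bits &= 0xFFFF0000                                   # keep the top 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def matvec(M, v):
    """Matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(P, Q):
    """Matrix-matrix product on nested lists."""
    cols = list(zip(*Q))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def merge(W, A, B, scale, cast=lambda v: v):
    """Fold the adapter into the base weight: W' = cast(W + scale * B @ A)."""
    delta = matmul(B, A)
    return [[cast(w + scale * d) for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def lora_forward(W, A, B, scale, x):
    """Unmerged path: base output plus the low-rank update, kept separate."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

# Toy layer (hypothetical values): 1x4 base weight with a rank-1 adapter.
# Every value is a power of two, so each is exactly representable in bfloat16.
W = [[1.0, 1.0, 1.0, 1.0]]      # d_out x d_in
A = [[2**-5] * 4]               # r x d_in
B = [[2**-5]]                   # d_out x r
scale = 1.0                     # stands in for alpha / r
x = [1.0, 1.0, 1.0, 1.0]

y_lora = lora_forward(W, A, B, scale, x)                         # adapter kept separate
y_merged_exact = matvec(merge(W, A, B, scale), x)                # merge without rounding
y_merged_bf16 = matvec(merge(W, A, B, scale, cast=to_bf16), x)   # merge stored as bf16

print(y_lora, y_merged_exact, y_merged_bf16)
```

In this toy case the per-weight update is 2^-10, which is smaller than half the bfloat16 spacing near 1.0 (2^-8), so the merged bfloat16 weight rounds back to 1.0 and the adapter's contribution disappears. The unmerged path and the unrounded merge both give 4.00390625, while the bfloat16 merge gives 4.0. Real checkpoints are far larger and the effect would vary by layer, so treat this as intuition rather than a diagnosis.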