Closed ThierryDeruyttere closed 1 year ago
Hi,
LLaVA-RLHF is trained with LoRA in the bfloat16 data type. While it is possible to merge the LoRA weights into the base model and thus enable inference with libraries such as TGI and vLLM, we found that the merged weights can lead to degraded performance. Therefore, we recommend loading the LoRA weights directly with the PEFT-LoRA framework when evaluating our models.
Hi,
Could you maybe provide merged models, i.e., the LoRA weights merged with the base model? This would make it possible to load the model directly in 4-bit. Thanks!