kvablack / ddpo-pytorch

DDPO for finetuning diffusion models, implemented in PyTorch with LoRA support
MIT License

On reproducing LLaVA alignment experiments. #5

Closed bhattg closed 1 year ago

bhattg commented 1 year ago

Hi! I have a couple of questions about the LLaVA alignment experiments:

kvablack commented 1 year ago

I used the 13B variant of LLaVA, and all experiments were run with fp16 precision.
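
For anyone sizing hardware for a reproduction, here is a rough back-of-the-envelope sketch of what fp16 means for the weights of a 13B-parameter model. It assumes the nominal count of 13 billion parameters and counts weights only (no activations, KV cache, or framework overhead); the helper name `weight_memory_gib` is made up for illustration and is not part of this repo or of LLaVA.

```python
# Rough weight-memory estimate for a model at a given precision.
# Assumption: the nominal "13B" parameter count; the real count differs slightly.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    nominal_params = 13e9  # assumed, not measured
    for dtype in ("fp32", "fp16"):
        print(f"{dtype}: {weight_memory_gib(nominal_params, dtype):.1f} GiB")
```

Under these assumptions the fp16 weights come to roughly 24 GiB, half of the fp32 figure, so actual inference needs more than that once activations are included.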

bhattg commented 1 year ago

Thanks, closing the issue now. :-)