Hi authors, when I tested the reward model you provided, I found that the reward scores are always below 0. Could you share the reward prompt you used when training and applying this model?
Also, is the LLaVA prompt you used identical to the original LLaVA prompt?
Thanks.