RLHF-V / RLAIF-V

RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness

How are the logps in the data obtained? #16

Closed: Spring24ch closed this issue 2 months ago

Haoye17 commented 2 months ago

Hello! Thank you for your interest in our work~ We have integrated the process of calculating logp into the training procedure, so the logp values are automatically computed for the dataset during training. The relevant code can be found here. If you have more questions, we are happy to assist!
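For readers unfamiliar with the term, the following is a minimal sketch of what a per-response logp typically means in preference training: the sum (or average) of the log-probabilities the model assigns to each response token. This is not the repository's code. The names `sequence_logp` and `IGNORE_INDEX` are hypothetical, and the sketch assumes a causal language model (the logits at position t predict the token at position t + 1) with prompt tokens masked by -100.

```python
import math

# Assumed mask value for tokens that should not be scored (e.g. the prompt).
IGNORE_INDEX = -100


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def sequence_logp(logits, labels, ignore_index=IGNORE_INDEX):
    """Return (sum_logp, avg_logp, per_token_logps) over unmasked tokens.

    logits: list of T rows, each a list of V floats.
    labels: list of T token ids; masked positions hold ignore_index.
    """
    per_token = []
    # Shift by one: logits[t] scores the token at labels[t + 1].
    for t in range(len(labels) - 1):
        target = labels[t + 1]
        if target == ignore_index:
            continue
        per_token.append(log_softmax(logits[t])[target])
    total = sum(per_token)
    avg = total / len(per_token) if per_token else 0.0
    return total, avg, per_token


# Toy example: vocabulary of 4 tokens, uniform logits, two response tokens.
logits = [[0.0, 0.0, 0.0, 0.0] for _ in range(4)]
labels = [IGNORE_INDEX, IGNORE_INDEX, 2, 1]
total, avg, per_token = sequence_logp(logits, labels)
print(total, avg, per_token)
```

In a real pipeline the logits would come from a forward pass of the model over the prompt plus response, and the same function would be applied to both the chosen and the rejected response.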

PangziZhang523 commented 2 months ago

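A question: when using LLaVA-NeXT as the labeler model to train LLaVA 1.5, are the positive and negative samples in the data generated during training, or is the data generated in advance and then used for training?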

yiranyyu commented 2 months ago

They are generated in batches before training, because this makes batched inference more efficient.
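A rough sketch of the offline pattern, producing the preference pairs in batches before training and saving them to disk, might look like the following. It is only an illustration under stated assumptions, not the project's pipeline: `build_preference_file`, `batched`, the JSONL field names, and the `label_batch` callable (a stand-in for whatever labeler model produces a chosen and a rejected response per prompt) are all hypothetical.

```python
import json


def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def build_preference_file(prompts, label_batch, path, batch_size=8):
    """Write one JSON line per prompt and return the number of lines written.

    label_batch: hypothetical callable that takes a list of prompts and
    returns a list of (chosen, rejected) pairs in the same order.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        for batch in batched(prompts, batch_size):
            # One call per batch, so the labeler can run batched inference.
            for prompt, (chosen, rejected) in zip(batch, label_batch(batch)):
                record = {"prompt": prompt, "chosen": chosen, "rejected": rejected}
                f.write(json.dumps(record, ensure_ascii=False) + "\n")
                written += 1
    return written
```

Training then reads the saved file, so the labeler model does not need to be loaded alongside the model being trained.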