luogen1996 / LaVIN

[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"

Computing output likelihoods with the model #1

Open vishaal27 opened 1 year ago

vishaal27 commented 1 year ago

Hi, is it possible to get the tokenwise log-likelihood scores of different outputs from the model?

The use case would be something like: given an interleaved image/text input and a list of output text candidates, we should be able to get a score for each candidate and return their ranked list, rather than generating the outputs directly. This would be close to how LLMs are evaluated on MCQ tasks. For example, from the T0 paper, page 6 (https://arxiv.org/pdf/2110.08207.pdf):

> For tasks that involve choosing the correct completion from several options (e.g. multiple choice question answering), we follow Brown et al. (2020) and use rank classification to evaluate our model: we compute the log-likelihood of each of the target options under the fine-tuned model and select the option with the highest log-likelihood as the prediction. For simplicity, we do not apply length normalization to the log-likelihoods of the target options.

Is it straightforward to do this with LaVIN? I assume that, since the LM is built on transformers, it should be possible to reuse output scoring functions that are already implemented (I haven't dug into this yet)?
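For reference, here is a minimal sketch of the rank-classification idea with a generic causal LM. It assumes a `model` whose forward pass returns per-position logits of shape `(batch, seq_len, vocab)` and a `tokenizer` with an `encode()` method; both are hypothetical stand-ins, not LaVIN's actual API, so the call sites would need to be adapted to LaVIN's LLaMA forward.

```python
# Sketch: rank candidates by token-wise log-likelihood under a causal LM.
# `model` and `tokenizer` are assumed interfaces, not LaVIN's real API.
import torch
import torch.nn.functional as F

@torch.no_grad()
def candidate_loglikelihood(model, tokenizer, prompt: str, candidate: str) -> float:
    """Sum of log-probs of the candidate tokens, conditioned on the prompt."""
    prompt_ids = tokenizer.encode(prompt)              # list[int]
    cand_ids = tokenizer.encode(candidate)             # list[int]
    input_ids = torch.tensor([prompt_ids + cand_ids])  # (1, T)

    logits = model(input_ids)                          # (1, T, vocab), assumed shape
    # The log-prob of token t is read from the logits at position t-1.
    log_probs = F.log_softmax(logits[0, :-1], dim=-1)  # (T-1, vocab)
    targets = input_ids[0, 1:]                         # (T-1,)
    token_lls = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)

    # Keep only the positions that predict the candidate tokens.
    cand_lls = token_lls[len(prompt_ids) - 1:]
    return cand_lls.sum().item()

def rank_candidates(model, tokenizer, prompt, candidates):
    """Return candidates sorted from most to least likely (no length normalization)."""
    scores = {c: candidate_loglikelihood(model, tokenizer, prompt, c) for c in candidates}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Following the T0 setup quoted above, this keeps the raw (unnormalized) sum of log-likelihoods; dividing by the candidate length would give a length-normalized variant.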

deep-matter commented 1 year ago

Did you re-run the implementation? I am trying to re-implement the same modeling design, but instead of LLaMA I would like to use LoRA, and I was also thinking about formalizing a loss for multi-task learning such as MCO.