corentin-ryr / MultiMedEval

A Python tool to evaluate the performance of VLMs in the medical domain.
MIT License

Questions about the format of input and output data #12

Status: Closed (Yanllan closed this 4 months ago)

Yanllan commented 5 months ago

Sorry to bother you again! I have some questions about the format of the input and output data. For the images field, is each entry an image path that should be read with PIL, e.g. PIL.Image.open(path)? And what is the JSON format of the answers? Could you give a specific example? I am very sorry to bother you all the time, and I would really appreciate your reply.

corentin-ryr commented 5 months ago

Hello,

The list of images is a list of Pillow Image objects (not paths).
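In other words (an illustrative sketch, not MultiMedEval's actual loading code; "example.png" is a placeholder file), each element is equivalent to an image already opened with Pillow:

from PIL import Image

# The batcher receives already-loaded images, so there is no path to open;
# each element is equivalent to the result of:
img = Image.open("example.png")
print(isinstance(img, Image.Image))  # True: this is the type the batcher sees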

The answer from the batcher should be a list of strings (one string per sample in the input). Here is an example batcher for the Mistral 7B model implemented using vLLM:

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

class batcherMistral:
    def __init__(self) -> None:
        model_name = "mistralai/Mistral-7B-Instruct-v0.2"
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.tokenizer.pad_token = self.tokenizer.eos_token
        self.LLM = LLM(model_name)

    def __call__(self, prompts):
        # Each element of `prompts` carries an HF-style conversation at index 0;
        # render it to a plain prompt string with the model's chat template.
        model_inputs = [self.tokenizer.apply_chat_template(messages[0], tokenize=False) for messages in prompts]

        # Low-temperature sampling, capped at 400 new tokens per sample.
        outputs = self.LLM.generate(model_inputs, SamplingParams(temperature=0.05, max_tokens=400))
        decoded = [output.outputs[0].text for output in outputs]

        # One answer string per input sample.
        return decoded

It is implemented as a Python callable class. It takes the HF conversations as input, formats them with the chat template of the Mistral model, and returns the text generated by the model.
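As a rough sketch of how such a batcher gets invoked (the exact fields MultiMedEval passes may differ, and the conversation content below is made up), each element of prompts holds the conversation at index 0:

# Illustrative call, assuming each prompt's first element is the conversation:
prompts = [
    [[{"role": "user", "content": "Describe the main finding in one sentence."}]],
]
batcher = batcherMistral()
answers = batcher(prompts)  # list of strings, one per input sample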

I hope this helps, and I will clarify the README in the future.