Closed lixu6-alt closed 2 months ago
The simplest way is to first define a model base class that initializes the model, tokenizer, and chat template of your model. Then define a `generate_inner(self, message, dataset=None)` function to output the answer. Note that this function handles a single question.
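To make this concrete, here is a minimal sketch of such a wrapper class. The class and method names other than `generate_inner` are hypothetical, and the interleaved message format (a list of `{'type': 'text'|'image', 'value': ...}` dicts) is my assumption about what VLMEvalKit passes in; the actual model call is left as comments.

```python
# Hypothetical sketch of a VLMEvalKit-style model wrapper.
# Assumption: `message` is a list of {'type': 'text'|'image', 'value': ...}
# dicts; the model/processor calls are placeholders, not a real API.

class MyModelWrapper:
    def __init__(self):
        # In a real wrapper you would load your model, tokenizer/processor,
        # and chat template here (e.g. via transformers).
        pass

    def generate_inner(self, message, dataset=None):
        # Split the interleaved message into prompt text and image paths.
        texts = [m['value'] for m in message if m['type'] == 'text']
        image_paths = [m['value'] for m in message if m['type'] == 'image']
        prompt = '\n'.join(texts)
        # A real implementation would now build processor inputs and call
        # self.model.generate(...); here we just echo what was assembled.
        return f'{len(image_paths)} image(s), prompt: {prompt}'
```

The key point is the signature: `generate_inner` receives one question's message and returns one answer string.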
Thanks a lot for the timely response. Your explanation makes sense for me, but i am still wondering if there is any document in the git folder that teaches how to implement the generate_inner() function and what the inputs and ouputs are expected to be.
For example, if you are using a model based on the llava-next architecture, then you need to ensure that your input message matches this form:
```python
conversation = [
    {
        'role': 'user',
        'content': content,
    }
]
```
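A small helper can assemble that `content` from an interleaved message. This is a hedged sketch: the input field names (`type`/`value`) are my assumption about the message format, and the output follows the text/image-placeholder convention used by llava-next chat templates, where actual PIL images are passed to the processor separately.

```python
# Hedged sketch: build a llava-next style conversation from an interleaved
# message. Assumption: items are {'type': 'text'|'image', 'value': ...}.
def build_conversation(message):
    content = []
    for item in message:
        if item['type'] == 'text':
            content.append({'type': 'text', 'text': item['value']})
        elif item['type'] == 'image':
            # The chat template only needs an image placeholder; the PIL
            # images themselves go to the processor call, not the template.
            content.append({'type': 'image'})
    return [{'role': 'user', 'content': content}]
```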
The input is passed into the model to generate the answer roughly in the following way:

```python
prompt = self.processor.apply_chat_template(conversation, add_generation_prompt=True)
inputs = self.processor(prompt, images, return_tensors='pt').to('cuda', torch.float16)
output = self.model.generate(**inputs, **self.kwargs)
answer = self.processor.decode(output[0], skip_special_tokens=True)
answer = self.output_process(answer)
```
Everything depends on your model architecture. You can refer to the `vlmeval/vlm/mantis.py` I created, which also shows how to build prompts, remove identifier tokens, and so on.
By the way, `images` is generally a list of the images decoded as PIL Images, in RGB mode.
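Loading that list from the image paths in the message might look like the following sketch (requires Pillow; the helper name is mine):

```python
# Hedged sketch: turn image paths into the RGB PIL list the processor
# expects. Requires Pillow.
from PIL import Image

def load_images(image_paths):
    # Force RGB mode, since source files may be grayscale or RGBA.
    return [Image.open(p).convert('RGB') for p in image_paths]
```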
Hi, I am wondering how to evaluate a new model that I developed myself using VLMEvalKit. The README file does mention that I only need to create a function called inner_function() (maybe.. can't remember the exact name), but does not provide any instructions about how to proceed. Can anybody help? Thanks.