This repository contains demos I made with the Transformers library by HuggingFace.
LLaVa-NeXT/Fine_tune_LLaVaNeXT_on_a_custom_dataset_(with_PyTorch_Lightning).ipynb fails at training #461
Open
7AtAri opened 3 months ago
```
ValueError                                Traceback (most recent call last)
in <cell line: 19>()
     17 )
     18 
---> 19 trainer.fit(model_module)

24 frames
/usr/local/lib/python3.10/dist-packages/transformers/models/llava_next/modeling_llava_next.py in _merge_input_ids_with_image_features(self, image_features, feature_lens, inputs_embeds, input_ids, attention_mask, position_ids, labels, image_token_index, ignore_index)
    541         total_num_special_image_tokens = torch.sum(special_image_token_mask)
    542         if total_num_special_image_tokens != num_images:
--> 543             raise ValueError(
    544                 f"Number of image tokens in input_ids ({total_num_special_image_tokens}) different from num_images ({num_images})."
    545             )

ValueError: Number of image tokens in input_ids (0) different from num_images (1).
```
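For context, my reading of the check that fails (the sketch below is not the notebook's code, and the checkpoint name is only an example): during the forward pass the model counts the `<image>` tokens in `input_ids` and compares that count with the number of images the processor received, so the prompt has to contain exactly one `<image>` placeholder per image.

```python
# Minimal sketch of the invariant the error checks; not taken from the notebook.
from PIL import Image
from transformers import LlavaNextProcessor

# Checkpoint name is an assumption; the notebook may use a different one.
processor = LlavaNextProcessor.from_pretrained("llava-hf/llava-v1.6-mistral-7b-hf")

image = Image.open("example.jpg")                  # one image ...
prompt = "Question: <image>\nWhat is shown here?"  # ... and exactly one "<image>" token

batch = processor(text=prompt, images=image, return_tensors="pt")
# During model.forward, the number of "<image>" tokens found in input_ids is
# compared against the number of image features; a prompt containing zero
# "<image>" tokens triggers exactly the ValueError above.
```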
This error appears only after fixing another error concerning the chat_template. In the collate functions:

```python
chat_template = (
    "{% if messages[0]['role'] == 'instruction' %}"
    "Instruction: {{- messages[0]['content'] }}\n"
    "{% set messages = messages[1:] %}"
    "{% endif %}"
    "{% for message in messages %}"
    "Question:"
    "{% for line in message['query'] %}"
    "{% if line['type'] == 'text' %}"
    "{{- line['text'] }}"
    "{% elif line['type'] == 'image' %}"
    "{{ '<image>' }}"
    "{% endif %}"
    "{% endfor %}"
    "\n"
    "{% if 'answer' in message %}"
    "Short answer: "
    "{% for line in message['answer'] %}"
    "{% if line['type'] == 'text' %}"
    "{{- line['text'] }}"
    "{% elif line['type'] == 'image' %}"
    "{{ '<image>' }}"
    "{% endif %}"
    "{% endfor %}"
    "\n"
    "{% endif %}"
    "\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}"
    "Short answer: "
    "{% endif %}"
)

text_prompt = processor.tokenizer.apply_chat_template(
    conversation, chat_template=chat_template, add_generation_prompt=True
)
```
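In case it helps, here is a minimal sketch of how I would expect the collate step to be wired so the `<image>` placeholder survives; `conversation` and `images` are assumed to come from one dataset example, and they are not defined in the snippet above. Note that `apply_chat_template` tokenizes by default, so `tokenize=False` is needed if the rendered prompt is meant to be passed to the processor as a string.

```python
# Sketch only; `conversation` and `images` are assumed inputs from one example.
# tokenize=False keeps the rendered template as a plain string, so the "<image>"
# placeholder is still present when the processor pairs the text with the images.
text_prompt = processor.tokenizer.apply_chat_template(
    conversation,
    chat_template=chat_template,
    tokenize=False,
    add_generation_prompt=True,
)

batch = processor(
    text=text_prompt,
    images=images,   # list of PIL images; its length must match the "<image>" count
    padding=True,
    return_tensors="pt",
)
```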
https://github.com/huggingface/transformers/issues/32303