tingxueronghua / ChartLlama-code

MIT License

Embed Token Missing #18

Closed: Youho99 closed this issue 6 months ago

Youho99 commented 6 months ago

I use LLaVA-1.5-13B.

I want to apply a LoRA to it (any one, just to test), like this one for example. However, I run into an error telling me that the embed tokens do not exist.

After extensive research, here is what I found on the internet:

For certain models, the embedding layers are also trained during LoRA fine-tuning (for example, when new tokens appear only in the fine-tuning dataset): https://github.com/TUDB-Labs/multi-lora-fine-tune/issues/122

LLaMA (and therefore, by extension, LLaVA) is one of these models, according to the comment on lines 1334 to 1337 of this script: https://github.com/FartyPants/Training_PRO/blob/main/script.py

# modules_to_save = ["lm_head", "embed_tokens"]
# If you added new tokens to the tokenizer, you may need to save some LoRA modules because they need to know the new tokens.
# For LLaMA and Mistral, you need to save `embed_tokens` and `lm_head`. It may vary for other models.
# `embed_tokens` converts tokens to embeddings, and `lm_head` converts embeddings to token probabilities.

You must then add the modules ["lm_head", "embed_tokens"] to modules_to_save, as in the sketch below.
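As an illustration, a minimal sketch of such a configuration with Hugging Face PEFT could look like the following (the target modules and ranks are placeholders, not the settings actually used for ChartLlama):

from peft import LoraConfig

# Hypothetical config for illustration only; target_modules and ranks are
# placeholders. modules_to_save stores full copies of the listed layers in
# the adapter checkpoint, so retrained embeddings travel with the LoRA.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    modules_to_save=["lm_head", "embed_tokens"],  # needed when new tokens were added
)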

These are not configured in your LoRA, so it raises an error: the embedding layer and the tokens you regenerated are not available in the provided LoRA.
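To check whether a given adapter actually ships these weights, one can list the keys in its checkpoint. A quick diagnostic sketch (the file name assumes the usual PEFT layout; newer adapters may use adapter_model.safetensors instead):

import torch

# Assumes the common PEFT layout with an adapter_model.bin file.
state_dict = torch.load("adapter_model.bin", map_location="cpu")
saved = [k for k in state_dict if "embed_tokens" in k or "lm_head" in k]
print(saved or "no embed_tokens / lm_head weights in this adapter")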

Don't hesitate to tell me if I'm wrong about something.

tingxueronghua commented 6 months ago

Hi, thanks for your detailed explanation.

I did not fine-tune lm_head or embed_tokens, so ChartLlama does not contain their weights. The script https://github.com/tingxueronghua/ChartLlama-code/blob/main/model_vqa_lora.py should work well for inference.
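For reference, the generic PEFT pattern for stacking a LoRA without saved embedding weights onto a full base model looks roughly like this (the paths are placeholders, and this is not the exact loading code of model_vqa_lora.py):

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder paths. LLaVA checkpoints normally need the llava package's own
# loader rather than plain AutoModelForCausalLM; this only illustrates that
# the base model supplies embed_tokens/lm_head while the adapter adds the
# low-rank deltas.
base = AutoModelForCausalLM.from_pretrained("path/to/base-model")
model = PeftModel.from_pretrained(base, "path/to/chartllama-lora")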

I am not sure how you load LLaVA-1.5-13B, but this link https://huggingface.co/liuhaotian/llava-v1.5-13b/tree/main should contain all the modules needed.

Youho99 commented 6 months ago

I use the LLaVA template via text-generation-webui (https://github.com/oobabooga/text-generation-webui)

The multimodal model part works, but as soon as a LoRA is applied, the problem mentioned above appears. Here is an issue describing it: https://github.com/oobabooga/text-generation-webui/issues/5826

I therefore suppose there is an error in the multimodal implementation of the text-generation-webui repo.

So I will try this repo to use your LoRA.

tingxueronghua commented 6 months ago

I have not used that repository before and thus cannot provide suggestions... Sorry for that. Maybe you need to do some work to port ChartLlama's LoRA weights to it. Feel free to reopen this issue if you still have other questions.