Open shiva-vardhineedi opened 4 months ago
Describe the bug

It looks like lmdeploy downloads models from Hugging Face by default. The docs mention how to select a ModelSource, but there is no mention of how to use a locally available model. I have a fine-tuned InternVL mini model and am referencing it in the pipeline as sketched below; when I try to use it for inference with lmdeploy, I get no output.
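Roughly like this (paths are placeholders; lmdeploy's pipeline accepts a local directory in place of a repo id):

from lmdeploy import pipeline
from lmdeploy.vl import load_image

# placeholder path to the locally fine-tuned InternVL mini checkpoint
pipe = pipeline('/path/to/mini-internvl-finetuned')

image = load_image('/path/to/test.jpg')
response = pipe(('describe this image', image))
print(response.text)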
It indicates that lmdeploy didn't get the token_embeddings weights. If you can reproduce this issue with the official InternVL mini model, please kindly let us know the model's Hugging Face repo_id. The relevant code in lmdeploy infers head_num and vocab_size from that weight:
# head_num and vocab_size are inferred from the token embedding weight;
# if no checkpoint shard yields tok_embeddings, they are never bound
# and the update below raises
for bin in self.input_model.bins():
    emb = bin.tok_embeddings()
    if emb is not None:
        _vocab_size, dim = emb.shape
        head_num = dim // cfg.size_per_head
        break
final_cfg.update(dict(head_num=head_num, vocab_size=_vocab_size))
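If the fine-tuned checkpoint is missing that weight, you can check the uploaded shards directly. A minimal sketch, assuming the model is stored as safetensors (the exact key name varies; internlm2-based InternVL models use 'tok_embeddings'):

import glob
from safetensors import safe_open

# print every embedding-related weight name found in the checkpoint shards
for shard in glob.glob('/path/to/model/*.safetensors'):
    with safe_open(shard, framework='pt') as f:
        for name in f.keys():
            if 'tok_embeddings' in name or 'embed_tokens' in name:
                print(shard, name, f.get_slice(name).get_shape())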
Hi @lvhan028, thanks for the reply. The locally downloaded base model works fine; the issue is with my LoRA fine-tuned model. I uploaded my locally fine-tuned model to Hugging Face. I am able to load the model successfully, but no generations are produced when I use it with LMDeploy. Can you please let me know what might be wrong with this fine-tuned model? It is not generating a response; I just get an empty string back.
@shiva-vardhineedi
Have you tried running inference on your hf model with the transformers API?
I ran inference on your model with the transformers API, and the result is weird.
Below is my test code; the output is empty:
from transformers import AutoTokenizer, AutoModel

m = AutoModel.from_pretrained('shivavardhineedi/mini_internVL_eval', trust_remote_code=True).cuda().bfloat16()
tok = AutoTokenizer.from_pretrained('shivavardhineedi/mini_internVL_eval', trust_remote_code=True)
# text-only chat: no pixel_values, only max_new_tokens in the generation config
m.chat(tok, None, 'hello', dict(max_new_tokens=100))
# empty output
I printed lines 278 and 281 in https://huggingface.co/shivavardhineedi/mini_internVL_eval/blob/main/modeling_internvl_chat.py; the first tensor below looks like the input ids, and the second, the generated ids, is all zeros:
tensor([[ 1, 92543, 9081, 364, 2770, 657, 589, 15358, 17993, 6843,
963, 505, 4576, 11146, 451, 60628, 60384, 60721, 62442, 60752,
699, 92542, 92543, 1008, 364, 15115, 92542, 92543, 525, 11353,
364]], device='cuda:0')
tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0]], device='cuda:0')
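That would explain the empty string: if id 0 is a special token for this tokenizer (internlm2-style tokenizers use 0 for '<unk>', though that is an assumption here), decoding with skip_special_tokens=True drops everything. A quick check with the tokenizer loaded above as tok:

# a run of id 0 decodes to nothing once special tokens are skipped
print(repr(tok.decode([0] * 10, skip_special_tokens=True)))  # expected: ''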
@irexyc I used LoRA for fine-tuning and uploaded the model there, copying some missing files over from the original model. Why am I getting this?
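For reference, if the adapter was trained with peft, a common cause of a broken upload is exporting without first merging the LoRA weights into the base model. A minimal sketch, with placeholder repo id and paths, assuming peft was used:

from peft import PeftModel
from transformers import AutoModel

# placeholders: base model repo id and local adapter path
base = AutoModel.from_pretrained('OpenGVLab/Mini-InternVL-Chat-2B-V1-5',
                                 trust_remote_code=True)
model = PeftModel.from_pretrained(base, '/path/to/lora-adapter')

# fold the LoRA deltas into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained('/path/to/merged-model')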