salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence
BSD 3-Clause "New" or "Revised" License

xgen-mm (BLIP-3) inference error #740

Closed Gaojinpeng8 closed 1 week ago

Gaojinpeng8 commented 2 weeks ago

During model preparation, I first used the convert_hf_model.py script to convert the xgen-mm-phi3-mini-instruct-interleave-r-v1.5 model into an xgen-mm-phi3-mini-instruct-interleave-r-v1.5.pt file.

Then I changed the model_ckpt path in inference.ipynb to point to the xgen-mm-phi3-mini-instruct-interleave-r-v1.5.pt file. Running the code raises KeyError: 'model_state_dict'.

Changing ckpt = torch.load(cfg.ckpt_pth)["model_state_dict"] to ckpt = torch.load(cfg.ckpt_pth) resolves that error.
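A more defensive variant that handles both checkpoint layouts (a hypothetical helper, not part of the LAVIS codebase; it would wrap the result of the existing torch.load call) might look like this:

```python
def extract_state_dict(ckpt):
    """Return the model weights from a loaded checkpoint object.

    Checkpoints produced by convert_hf_model.py may store the weights at the
    top level, while training checkpoints often wrap them under a
    "model_state_dict" key; this accepts either layout.
    """
    if isinstance(ckpt, dict) and "model_state_dict" in ckpt:
        return ckpt["model_state_dict"]
    return ckpt
```

With this helper, the notebook line becomes ckpt = extract_state_dict(torch.load(cfg.ckpt_pth)) and works for both file formats.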

I then continued running the notebook cells up to generated_text = model.generate(..., where I hit RuntimeError: shape '[-1, 0]' is invalid for input of size 821. How can I fix this?

azshue commented 2 weeks ago

Hi @Gaojinpeng8 ,

Thank you for trying out our code.

We noticed this issue with newer versions of transformers and we're investigating it. In the meantime, could you try using transformers==4.41.2? The inference code should work with that version.
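For reference, pinning the version (assuming a pip-based environment) would look like:

```shell
pip install "transformers==4.41.2"
```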

I'll update here when I have any updates. Thanks!

azshue commented 1 week ago

Update:

The error occurs because the older modeling_phi3 code shipped with microsoft/Phi-3-mini-4k-instruct doesn't work with the latest transformers library. If you want to stay on the latest transformers, you can fix the issue by using the modeling_phi3 implementation bundled with the transformers library itself.

You can do this by setting trust_remote_code=False when calling AutoModelForCausalLM.from_pretrained.
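As a sketch (assuming the standard transformers API; the model id is the Phi-3 backbone named above, and the wrapper function is hypothetical), the load call would become:

```python
from transformers import AutoModelForCausalLM


def load_phi3_backbone(model_id: str = "microsoft/Phi-3-mini-4k-instruct"):
    # trust_remote_code=False tells transformers to use its own bundled
    # modeling_phi3 implementation, skipping the (outdated) remote code
    # that ships with the checkpoint repository.
    return AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=False)
```

Note that with trust_remote_code=False, transformers silently ignores the checkpoint's custom modeling files, so this only works for architectures (like Phi-3) that transformers supports natively.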