Vision-CAIR / MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
BSD 3-Clause "New" or "Revised" License

Error when preparing Vicuna v0 weights? Can I use Vicuna v1.1? #25

Closed: JosephPai closed this issue 1 year ago

JosephPai commented 1 year ago

Hi,

Thanks for releasing this interesting work. I'm trying to deploy it on my server, but I ran into difficulties when preparing the Vicuna weights.

When I apply the Vicuna delta weights to the original LLaMA weights, I always get a vocabulary-size mismatch error like this:

RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

I searched the issues in the FastChat repo but didn't find an effective solution. The FastChat author suggests moving directly to Vicuna v1.1, since the new version fixes many issues.

I'd like to ask: 1) Do you have any experience or suggestions for solving the issue I encountered? 2) Do you think it's feasible to move directly to Vicuna v1.1? I noticed that the new version has some changes; for example, the separator changed from ### to </s>. I'm not sure whether that is compatible with MiniGPT-4.

Thanks!

linkct commented 1 year ago

I encountered the same issue. Note that per the Vicuna official repo, Vicuna-v0 is only compatible with FastChat version <= v0.1.10, so pip install fschat==0.1.10 solved it for me.
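
A quick way to confirm which FastChat version is installed before applying the v0 delta is sketched below. It is only an illustration: the helper names (`parse_version`, `supports_vicuna_v0`, `installed_fschat_version`) are made up for this example, and the 0.1.10 cutoff is taken from the note above, not verified independently.

```python
import re
from importlib import metadata

# Cutoff taken from the comment above: Vicuna v0 needs fschat <= 0.1.10.
MAX_V0_COMPATIBLE = (0, 1, 10)


def parse_version(text):
    """Turn a version string such as '0.1.10' into a tuple of up to three ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def supports_vicuna_v0(version_text):
    """Return True if this fschat version is at or below the assumed cutoff."""
    return parse_version(version_text) <= MAX_V0_COMPATIBLE


def installed_fschat_version():
    """Return the installed fschat version string, or None if it is not installed."""
    try:
        return metadata.version("fschat")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_fschat_version()
    if version is None:
        print("fschat is not installed")
    elif supports_vicuna_v0(version):
        print(f"fschat {version} should work with the Vicuna v0 delta")
    else:
        print(f"fschat {version} is newer than 0.1.10; consider: pip install fschat==0.1.10")
```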

MrToy commented 1 year ago

Same issue. Looking forward to Vicuna v1.1 support.

JosephPai commented 1 year ago

@linkct This solution works for me. Thanks!

LARRYMIN commented 1 year ago

You can change the source code of fastchat.model.apply_delta:

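# Vocabulary-sized delta tensors have 32001 rows but the base has 32000,
# so drop the extra row before adding.
if delta_state_dict[name].size(0) == 32001:
    state_dict[name] += delta_state_dict[name][:32000, :]
else:
    # Shapes already match; add the delta as-is.
    state_dict[name] += delta_state_dict[name]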
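
To show what this patch does in isolation, here is a small sketch that uses plain nested Python lists in place of PyTorch tensors so that it runs without any dependencies. The function names (`add_rows`, `merge_with_truncated_delta`) and the toy data are hypothetical and are not part of FastChat; only the 32000/32001 sizes come from the error message in this thread.

```python
VOCAB_SIZE = 32000  # base LLaMA vocabulary size, per the error message


def add_rows(base_rows, delta_rows):
    """Element-wise sum of two equally shaped 2-D lists."""
    return [
        [b + d for b, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base_rows, delta_rows)
    ]


def merge_with_truncated_delta(state_dict, delta_state_dict, vocab_size=VOCAB_SIZE):
    """Add each delta to its base weight, dropping one extra vocabulary row if present."""
    merged = {}
    for name, base in state_dict.items():
        delta = delta_state_dict[name]
        if len(delta) == vocab_size + 1:
            # The delta carries one extra token row; keep only the first vocab_size rows.
            delta = delta[:vocab_size]
        if len(delta) != len(base):
            raise ValueError(f"row count mismatch for {name}: {len(base)} vs {len(delta)}")
        merged[name] = add_rows(base, delta)
    return merged


if __name__ == "__main__":
    # Toy example with a vocabulary of 3 instead of 32000.
    base = {"embed": [[1.0], [2.0], [3.0]], "proj": [[1.0, 1.0]]}
    delta = {"embed": [[0.5], [0.5], [0.5], [9.0]], "proj": [[2.0, 3.0]]}
    print(merge_with_truncated_delta(base, delta, vocab_size=3))
```

Note that truncation discards the delta's extra token row, so whatever that row encoded is not present in the merged weights.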

feymanwang commented 1 year ago

Can you show the line number where this should be added?

WynMew commented 1 year ago

Line 107, I guess.