ParadoxZW / LLaVA-UHD-Better

A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
Apache License 2.0

Input embeds dimension mismatch #4

Closed Sootung closed 1 week ago

Sootung commented 1 week ago

Thank you for fixing some of the bugs in the original project with LLaVA-UHD-Better. However, when running this project I hit a dimension mismatch in cur_new_input_embeds at https://github.com/ParadoxZW/LLaVA-UHD-Better/blob/main/llava_uhd/adapt_llava.py#L327. The entries of that variable are:

cur_new_input_embeds[0]: torch.Size([1, 5120])
cur_new_input_embeds[1]: torch.Size([64, 4096])
cur_new_input_embeds[2]: torch.Size([1, 4096])
cur_new_input_embeds[3]: torch.Size([64, 4096])
cur_new_input_embeds[4]: torch.Size([1, 4096])
cur_new_input_embeds[5]: torch.Size([64, 4096])
cur_new_input_embeds[6]: torch.Size([11, 5120])

As a result, torch.cat fails.
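To make the failure concrete, here is a minimal sketch that checks the reported shapes before concatenation. It assumes the segments are joined along the sequence dimension (dim 0), in which case torch.cat requires every segment to share the same last dimension. The helper name find_width_mismatches is hypothetical and not part of the repository; it works on plain shape tuples so it runs without PyTorch.

```python
from typing import List, Sequence, Tuple


def find_width_mismatches(
    shapes: Sequence[Tuple[int, ...]], hidden_size: int
) -> List[int]:
    """Return the indices of segments whose last dimension differs from hidden_size.

    Hypothetical helper for illustration only; not part of LLaVA-UHD-Better.
    """
    return [i for i, shape in enumerate(shapes) if shape[-1] != hidden_size]


# Shapes copied from the report above.
reported_shapes = [
    (1, 5120),
    (64, 4096),
    (1, 4096),
    (64, 4096),
    (1, 4096),
    (64, 4096),
    (11, 5120),
]

# Assumption: the text embeddings (width 5120) reflect the LLM hidden size.
bad = find_width_mismatches(reported_shapes, hidden_size=5120)
print(bad)  # [1, 2, 3, 4, 5]
```

With real tensors, the shapes could be collected as [tuple(t.shape) for t in cur_new_input_embeds]. Segments 0 and 6 have width 5120, while segments 1 through 5 have width 4096, so the two groups cannot be concatenated.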

ParadoxZW commented 1 week ago

https://github.com/ParadoxZW/LLaVA-UHD-Better/blob/2c1dc19d99979aa50c17165f46310e0b263b19ff/llava_uhd/vision_projector.py#L43

Please change this constant. I trained with Vicuna-7B by default, and I am guessing you are using the 13B model.
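As a rough illustration of the fix, the sketch below picks the projector output width from the language model name instead of hard-coding it. The constant at the linked line is not reproduced here; the sketch assumes it is the projector's output width and must equal the LLM hidden size. The names KNOWN_HIDDEN_SIZES and projector_output_dim are hypothetical. The two sizes (4096 for the 7B model, 5120 for the 13B model) match the widths seen in the report.

```python
# Hidden sizes of the LLaMA-based Vicuna models.
# Hypothetical lookup table for illustration; not part of the repository.
KNOWN_HIDDEN_SIZES = {
    "vicuna-7b": 4096,
    "vicuna-13b": 5120,
}


def projector_output_dim(model_name: str) -> int:
    """Return the width the vision projector must emit for the given LLM.

    Hypothetical helper: matches a known model family by substring.
    """
    key = model_name.lower()
    for name, size in KNOWN_HIDDEN_SIZES.items():
        if name in key:
            return size
    raise ValueError(f"Unknown model, set the width manually: {model_name}")


print(projector_output_dim("lmsys/vicuna-7b-v1.5"))   # 4096
print(projector_output_dim("lmsys/vicuna-13b-v1.5"))  # 5120
```

When the model is loaded through Hugging Face transformers, the same value is exposed as hidden_size on the model config, which avoids a lookup table altogether.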

Sootung commented 1 week ago

That was indeed the case, thank you.

ParadoxZW commented 1 week ago

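If you manage to reproduce the 13B results, you are welcome to share your benchmark scores and weights (I do not have the resources to train a 13B model).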

:)

Sootung commented 5 days ago

No problem, but I currently have only a single GPU (80 GB) available.