dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Apache License 2.0
1.75k stars · 126 forks

New version <LISA-13B-llama2-v1-explanatory> inference VERY SLOWLY #57

Open LWShowTime opened 11 months ago

LWShowTime commented 11 months ago

The new version of LISA is far too slow at inference. The same segmentation command takes a few seconds with the old version, but tens of minutes with the new one.

X-Lai commented 11 months ago

It works well on my side. Can you give me more details about this?

LWShowTime commented 11 months ago

[screenshot: bfloat16 warning] Does this warning about bfloat16 matter? In the previous version, I remember inference was fast: segmenting an image took only several seconds. However, the new 13B version infers slowly, with the GPU at 100% utilization.

My launch command: `python chat.py --version=/localpath/LISA-13B-llama2-v1-explanatory/ --vision_tower=/localpath/CLIP-vit-large-patch14/ --precision=bf16 --load_in_8bit --image_size=512`
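One quick experiment (a suggestion, not a confirmed fix): 8-bit quantization via `--load_in_8bit` can be much slower than native bf16 on some GPUs, so comparing against a run without that flag may isolate the bottleneck. The paths below simply mirror the ones in the command above:

```shell
# Same command, but without --load_in_8bit, to check whether
# 8-bit quantization is what slows inference down.
python chat.py \
  --version=/localpath/LISA-13B-llama2-v1-explanatory/ \
  --vision_tower=/localpath/CLIP-vit-large-patch14/ \
  --precision=bf16 \
  --image_size=512
```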

And I changed several lines because the vision tower in the default config can't be overridden by the launch parameter. Here are the changes in `model/llava/model/multimodal_encoder`, in the class `CLIPVisionTower`, from line 19 to around line 30: I set `self.cfg_only`, `self.image_processor`, and `self.vision_tower` using a local path string in the `from_pretrained` calls. [screenshot of the modified code]
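A minimal sketch of the override described above, assuming a transformers-style `CLIPVisionTower`; `resolve_vision_tower` and `local_override` are illustrative names, not LISA's actual API:

```python
from typing import Optional


def resolve_vision_tower(config_value: str, local_override: Optional[str]) -> str:
    """Return the path/ID to feed into from_pretrained().

    If a local override is given (e.g. via --vision_tower), it wins;
    otherwise fall back to whatever the model config specifies.
    """
    return local_override if local_override else config_value


# Inside CLIPVisionTower.__init__ one would then do something like:
#   path = resolve_vision_tower(self.vision_tower_name, args.vision_tower)
#   self.image_processor = CLIPImageProcessor.from_pretrained(path)
#   self.vision_tower = CLIPVisionModel.from_pretrained(path)
```

This avoids hardcoding the local path inside the class, so the same code works with both a Hugging Face model ID and a local checkout.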

LWShowTime commented 9 months ago

@X-Lai And when I run version 2 on multi-GPU devices, I get this error: `indices should be either on cpu or on the same device as the indexed tensor (cuda:0)`
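A common fix for that class of error, shown here as a general PyTorch pattern rather than the exact LISA code path, is to move the index tensor onto the indexed tensor's device before indexing:

```python
import torch


def safe_index(t: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
    """Index `t` with `idx`, moving `idx` to t's device first.

    Avoids "indices should be either on cpu or on the same device
    as the indexed tensor" when a model is sharded across GPUs and
    an index tensor ends up on a different device than the data.
    """
    return t[idx.to(t.device)]
```

When a model is split across GPUs (e.g. with `device_map="auto"`), intermediate tensors can land on different devices, so any manually constructed index tensor needs an explicit `.to(...)` like this.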