LWShowTime opened this issue 11 months ago
It works well on my side. Can you give me more details about this?
Does this warning about bfloat16 matter? In the last version, I remember inference was fast: segmenting an image took only several seconds. However, the new 13B version runs inference slowly, with the GPU at 100% utilization.
My launch parameters:
python chat.py --version=/localpath/LISA-13B-llama2-v1-explanatory/ --vision_tower=/localpath/CLIP-vit-large-patch14/ --precision=bf16 --load_in_8bit --image_size=512
I also changed several lines, because the vision tower set in the default config can't be overridden by the launch parameter. The changes are in model/llava/model/multimodal_encoder, in the class CLIPVisionTower, from line 19 to around line 30: I set self.cfg_only, self.image_processor, and self.vision_tower by passing a local path string to the from_pretrained calls.
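The underlying problem is that the path baked into the pretrained model config wins over the --vision_tower flag. A minimal pure-Python sketch of the override logic (the function name resolve_vision_tower and both example paths are illustrative, not from the repo):

```python
from typing import Optional

def resolve_vision_tower(cli_path: Optional[str], config_path: str) -> str:
    """Prefer the path given on the command line over the one stored in
    the pretrained model config, so a local CLIP checkpoint can be used.

    cli_path    -- value of --vision_tower, or None if the flag was omitted
    config_path -- vision tower name/path baked into the model config
    """
    return cli_path if cli_path else config_path

# With the flag set, the local checkpoint is used:
resolve_vision_tower("/localpath/CLIP-vit-large-patch14/",
                     "openai/clip-vit-large-patch14")
# Without it, the config's default is kept:
resolve_vision_tower(None, "openai/clip-vit-large-patch14")
```

Threading the resolved path through to the from_pretrained calls in CLIPVisionTower would make the hand-edits to lines 19-30 unnecessary.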
@X-Lai Also, when I run version 2 on multiple GPUs, I get this error:
indices should be either on cpu or on the same device as the indexed tensor (cuda:0)
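This error usually means an index tensor lives on a different device than the tensor being indexed, which is easy to hit when a model is sharded across GPUs. A common workaround (a sketch of the general pattern, not the LISA maintainers' fix; safe_index is a hypothetical helper) is to move the indices onto the indexed tensor's device first:

```python
import torch

def safe_index(tensor: torch.Tensor, indices: torch.Tensor) -> torch.Tensor:
    """Index `tensor` with `indices`, first moving the indices onto the
    same device as `tensor`. This avoids:
      'indices should be either on cpu or on the same device as the
       indexed tensor (cuda:0)'
    which can occur when layers are split across multiple GPUs."""
    return tensor[indices.to(tensor.device)]

# CPU demonstration; on a sharded model the .to() call is what matters.
values = torch.arange(10)
idx = torch.tensor([1, 3])
selected = safe_index(values, idx)
```

In multi-GPU runs, intermediate tensors can land on different cuda devices, so any hand-written advanced indexing in the forward pass may need this kind of guard.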
Inference with the new version of LISA is much too slow: the same segmentation command that took a few seconds on the old version takes tens of minutes on the new one.