JackAILab / ConsistentID

Customized ID Consistent for human
MIT License
845 stars, 76 forks

vqa_LLVA and vqa_LLVA_more_face_detail #42

Closed gaoyixuan111 closed 4 months ago

gaoyixuan111 commented 5 months ago

When I use LLaVA to generate the captions, it is very slow: it takes about one minute to produce the vqa_LLVA and vqa_LLVA_more_face_detail descriptions for a single image.

JackAILab commented 5 months ago

Hi @gaoyixuan111, the LLaVA model is slow to load, but that cost should be paid only once. After loading, captioning a single image should take under 1.5 seconds on one 3090 GPU. Check your code to make sure the model is not being reloaded on every iteration of a loop.
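As a rough illustration of loading once and reusing the model, here is a minimal sketch. `DummyCaptioner`, `get_captioner`, and `caption_images` are hypothetical stand-ins invented for this example, not the repository's code or the real LLaVA API; the `time.sleep` call only simulates the slow checkpoint load.

```python
import functools
import time

LOAD_COUNT = 0  # counts how many times the expensive load actually runs


class DummyCaptioner:
    """Hypothetical stand-in for a loaded LLaVA model (not the real API)."""

    def caption(self, image_path: str, prompt: str) -> str:
        return f"caption for {image_path} ({prompt})"


@functools.lru_cache(maxsize=1)
def get_captioner() -> DummyCaptioner:
    """Load the model on the first call; later calls reuse the cached instance."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.01)  # placeholder for the slow checkpoint load
    return DummyCaptioner()


def caption_images(image_paths, prompt):
    model = get_captioner()  # fetched once, outside the per-image loop
    return [model.caption(path, prompt) for path in image_paths]


# Two passes over the images (one per prompt) still trigger a single load.
captions = caption_images(["a.jpg", "b.jpg", "c.jpg"], "Describe the face.")
more = caption_images(["d.jpg"], "Describe the face in more detail.")
```

If a counter like `LOAD_COUNT` grows with the number of images in your own script, the load is happening inside the loop and is the likely cause of the one-minute-per-image time.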

gaoyixuan111 commented 5 months ago

@JackAILab Could you share the training time for the model on 8 V100 GPUs and provide more training details?

JackAILab commented 4 months ago

Hi @gaoyixuan111, you can refer to issue #50.