Closed pyogher closed 1 year ago
Hi, thanks for your interest in our work. Currently we do not guarantee multi-GPU support. To enable multi-GPU inference, we recommend setting up an individual model replica on each GPU and writing your own inference scheduling program to dispatch requests across them.
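The per-GPU scheduling idea above can be sketched roughly as follows. This is a minimal illustration, not the maintainers' implementation: `infer_fn` is a placeholder you would replace with a call to the model replica loaded on `cuda:{gpu_id}`.

```python
import queue
import threading


def run_inference(prompts, num_gpus, infer_fn):
    """Distribute prompts across per-GPU workers, one model replica each.

    infer_fn(gpu_id, prompt) is a placeholder for the real model call,
    e.g. replicas[gpu_id].generate(...) with inputs on f"cuda:{gpu_id}".
    """
    tasks = queue.Queue()
    for i, p in enumerate(prompts):
        tasks.put((i, p))
    results = [None] * len(prompts)

    def worker(gpu_id):
        # Each worker drains the shared queue, so faster GPUs
        # naturally pick up more work.
        while True:
            try:
                idx, prompt = tasks.get_nowait()
            except queue.Empty:
                return
            results[idx] = infer_fn(gpu_id, prompt)

    threads = [threading.Thread(target=worker, args=(g,)) for g in range(num_gpus)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In practice you would load one mPLUG-Owl checkpoint per GPU before starting the workers; for heavier workloads, separate processes (one per GPU) avoid Python's GIL better than threads.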
Thank you for your response. I really appreciate your help.
Could you share your solution?
Hi!
I noticed a warning in the mPLUG_OwlForConditionalGeneration class:
"The language_model is not in the hf_device_map dictionary and you are running your script in a multi-GPU environment. This may lead to unexpected behavior when using accelerate. Please pass a device_map that contains language_model to remove this warning. Please refer to https://github.com/huggingface/blog/blob/main/accelerate-large-models.md for more details on creating a device_map for large models."
It seems the warning advises against splitting the layers of LLaMA across different GPUs without an explicit device_map. I'm curious about the reason behind this recommendation. Additionally, I have been exploring the evaluation of LVLMs recently and need to perform extensive inference across different benchmarks. Do you have any suggestions to improve the efficiency of inference with mPLUG-Owl in a multi-GPU environment? I've noticed that GPU utilization is not optimal. Thank you!
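For anyone hitting the same warning: a device_map that names `language_model` explicitly silences it. This is only a sketch of the config shape; the module names below are assumptions, so check `model.named_modules()` (or `hf_device_map` after loading) for the actual top-level names in your checkpoint.

```python
# Hypothetical device_map keeping the whole language model on one GPU
# (so its layers are not split) and the vision side on another.
# Module names ("vision_model", "abstractor", "language_model") are
# assumptions -- verify them against model.named_modules().
device_map = {
    "vision_model": 0,
    "abstractor": 0,
    "language_model": 1,
}

# Passed at load time, e.g.:
# model = MplugOwlForConditionalGeneration.from_pretrained(
#     checkpoint, device_map=device_map)
```

Values can be GPU indices, "cpu", or "disk"; any module not covered by a key falls back to accelerate's automatic placement.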