Florenyci opened 5 days ago
To be clear, I'm not looking to parallelize the inference process across multiple GPUs; I noticed this code, but it's not what I want. I want to run inference with multiple prompts in one go (sequentially is fine), so that I don't have to run bash inference.sh once per prompt.
Oh, if you just don't want to run inference.sh repeatedly, a good approach is to save your prompts, one per line, in a txt file, and then set input_name to that txt file in inference.yaml. It seems you are using the SAT code for inference, so I don't know why this problem occurred.
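For reference, a minimal sketch of the setup described above, with the prompts in test.txt, one per line. The key names below are assumptions based on this thread (input_name is the name used in the comment above); check the actual keys in your inference.yaml, as they may differ between repo versions:

```yaml
# inference.yaml (fragment; key names assumed from this thread)
input_type: txt        # read prompts from a text file instead of the CLI
input_name: test.txt   # one prompt per line
```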
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
Please make sure CUDA is installed correctly; the SAT code moves the entire transformer model to CUDA, so this issue should not arise.
Yes, that's exactly what I have done. I'm using the I2V model for inference, and the txt file looks like this:

text@@img.png
text@@img.png

But it doesn't work the way the T2V model does (with only one prompt it works). I think there is a bug in I2V mode when multiple prompts are given.
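For what it's worth, a minimal sketch of how such an I2V prompt file could be parsed, assuming @@ separates the prompt text from the image path as in the example above (the helper names are hypothetical, and the actual parsing in the repo may differ):

```python
def parse_i2v_line(line: str):
    """Split one 'prompt@@image_path' line into its two parts."""
    line = line.strip()
    if not line:
        return None  # skip blank lines
    prompt, sep, image_path = line.partition("@@")
    if not sep:
        raise ValueError(f"missing '@@' separator in line: {line!r}")
    return prompt, image_path

def load_prompt_file(path: str):
    """Read a txt file with one 'prompt@@image_path' entry per line."""
    with open(path, encoding="utf-8") as f:
        entries = [parse_i2v_line(line) for line in f]
    return [e for e in entries if e is not None]
```

Logging the parsed pairs before sampling makes it easy to see whether the multi-prompt path is fed the right inputs.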
I got this error too. I will fix it.
FrozenT5Embedder's transformer gets moved to cpu after the first inference; that is the problem. Why does it get moved to cpu? Good question. I don't know yet, but I will fix it.
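Until the root cause is found, one possible workaround is to move the text encoder back to the GPU before each sampling call. A minimal sketch, where ensure_on_device is a hypothetical helper and not part of the repo:

```python
import torch

def ensure_on_device(module: torch.nn.Module, device: str = "cuda:0") -> torch.nn.Module:
    # Hypothetical helper: if any parameter has drifted to a different
    # device (e.g. the text encoder ending up on cpu after the first
    # inference), move the whole module back before the next call.
    param = next(module.parameters(), None)
    if param is not None and param.device != torch.device(device):
        module.to(device)
    return module
```

Calling this on the conditioner / text encoder at the top of the per-prompt loop would mask the symptom while the underlying bug is investigated.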
System Info / 系統信息
I can run the official inference script successfully.
Information / 问题信息
Reproduction / 复现过程
Hi guys, thanks for your cool work! I was trying to run inference with the official 5B-I2V model. When I changed test.txt so that each line contains one prompt, I got this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
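To pin down which submodule ended up on the wrong device before the index_select call, a small diagnostic like this can help (report_devices is a hypothetical helper, not from the repo):

```python
import torch

def report_devices(model: torch.nn.Module) -> dict:
    """Count how many parameters live on each device, to spot
    modules that silently drifted to cpu."""
    counts: dict = {}
    for _, p in model.named_parameters():
        dev = str(p.device)
        counts[dev] = counts.get(dev, 0) + 1
    return counts
```

Running it on the loaded model between prompts should show the text encoder's parameters switching to "cpu" if the drift described elsewhere in this thread is happening.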
Is there any way I can do batch inference?

Expected behavior / 期待表现
I hope it could do batch inference