showlab / Tune-A-Video

[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
https://tuneavideo.github.io
Apache License 2.0
4.15k stars, 376 forks

ask for help #14

Closed: zcdliuwei closed this issue 1 year ago

zcdliuwei commented 1 year ago

RuntimeError: CUDA out of memory. Tried to allocate 4.00 GiB (GPU 0; 23.70 GiB total capacity; 8.31 GiB already allocated; 254.06 MiB free; 8.76 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My server environment is: [image]

zhangjiewu commented 1 year ago

Is xformers installed and enabled?

zcdliuwei commented 1 year ago

No. I ran your training code with everything from your requirements file installed except xformers, because pip install xformers always fails for me. Can you tell me how to install xformers? My server environment is shown in the image above.

Thank you very much

zhangjiewu commented 1 year ago

You need to install xformers to avoid OOM. For installation, please follow the official instructions.
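Whether xformers is visible to the active Python environment can be checked before starting a training run. The following is a minimal sketch using only the standard library; the helper name package_status is hypothetical and not part of the Tune-A-Video code. It reports only whether the package is installed, not whether the training script actually enables memory-efficient attention.

```python
import importlib.util
from importlib import metadata


def package_status(name: str) -> str:
    """Return a short, human-readable install status for a package."""
    # find_spec returns None when the top-level module cannot be located.
    if importlib.util.find_spec(name) is None:
        return f"{name}: not installed"
    try:
        return f"{name}: installed, version {metadata.version(name)}"
    except metadata.PackageNotFoundError:
        # Importable (e.g. a stdlib module) but no pip metadata is available.
        return f"{name}: importable, but no distribution metadata found"


print(package_status("xformers"))
```

If this prints "not installed", the pip install did not succeed in the environment that runs the training script.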

zcdliuwei commented 1 year ago

I ran pip install xformers==0.16rc425 to install xformers, with torch==1.13.1 and torchvision==0.14.1, and then encountered the following error during training:

[image]

zhangjiewu commented 1 year ago

It looks like you're using distributed training, which is not supported now. To avoid using multiple GPUs, you may specify one GPU device by export CUDA_VISIBLE_DEVICES=GPU_ID.
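The same restriction can be applied from Python when a run is launched as a child process. This is a minimal sketch, assuming GPU index 0; the helper single_gpu_env is a hypothetical name, and the child command here only echoes the variable rather than starting training.

```python
import os
import subprocess
import sys


def single_gpu_env(gpu_id: int, base_env=None) -> dict:
    """Copy an environment and restrict CUDA to a single device index."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


env = single_gpu_env(0)

# The child process inherits the restricted environment, so CUDA
# libraries loaded inside it can only see the selected device.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```

The variable has to be set before the process initializes CUDA, which is why it is placed in the child's environment rather than changed after startup.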

zcdliuwei commented 1 year ago

I specified a visible device as follows:

CUDA_VISIBLE_DEVICES=0 accelerate launch train_tuneavideo.py --config="configs/man-surfing.yaml"

but I still encountered the same error.

zhangjiewu commented 1 year ago

Can you try using the same environment as specified in requirements.txt? In particular, torch==1.12.1.
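A quick way to spot a version mismatch is to compare installed packages against the pinned versions. The sketch below is illustrative: only the torch pin comes from this thread, the PINS dictionary and the find_mismatches helper are hypothetical, and other entries would have to be copied from your own requirements.txt.

```python
from importlib import metadata

# Only the torch pin is taken from this thread; add the remaining
# entries from your own requirements.txt.
PINS = {"torch": "1.12.1"}


def find_mismatches(pins: dict, installed=None) -> dict:
    """Map package name -> (expected, found) for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        if installed is not None:
            # Use the supplied mapping instead of querying the environment.
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None
        # Ignore local build suffixes such as "+cu116" when comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


print(find_mismatches(PINS))
```

An empty dictionary means every pinned package matches; otherwise each entry shows the expected and the installed version.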

zcdliuwei commented 1 year ago

I have reinstalled the environment and training now runs normally, but the results are not as good as those in your README. Can you give me any suggestions for improvement?

The following are the GIFs I generated:

a panda is surfing

sks mr potato head is surfing in the forest

zhangjiewu commented 1 year ago

This might help: https://github.com/showlab/Tune-A-Video/issues/15#issuecomment-1422110060

StevensXu commented 1 year ago

It looks like you're using distributed training, which is not supported now. To avoid using multiple GPUs, you may specify one GPU device by export CUDA_VISIBLE_DEVICES=GPU_ID.

Why doesn't it support multi-GPU training? Would it be difficult to modify the code for multi-GPU training?

mayuelala commented 1 year ago

It looks like you're using distributed training, which is not supported now. To avoid using multiple GPUs, you may specify one GPU device by export CUDA_VISIBLE_DEVICES=GPU_ID.

Why doesn't it support multi-GPU training? Would it be difficult to modify the code for multi-GPU training?

I tried multi-GPU training, but the results were bad.