Closed by skrya 4 months ago
Thanks for your interest in our work. I have fixed the dimension error, and it should work properly now. If you have additional problems, please let me know.
Regarding training time and resources, the first-stage training took about 3 to 4 days for sky timelapse on 4 A100 GPUs (80 GB).
Thanks for the response. I will take a look at it. There was also a bug when evaluating sky_timelapse during training: the eval code was missing some lines needed to handle video. I hope that has been fixed as well?
Thank you for letting me know. We have fixed several issues regarding the video.
Dear authors,
Thank you for providing this wonderful code base. I tried running the first stage to generate videos with the sky_timelapse dataset, using the command below, but encountered the error that follows it.
CUDA_VISIBLE_DEVICES=0,1 accelerate launch --multi_gpu --num_processes=2 main.py --exp d2c-vae --configs configs/d2c-vae/skytimelapse_gan.yaml
/home/sudhir/anaconda3/envs/vid/lib/python3.8/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
0%| | 0/600 00:01<?, ?it/s
Traceback (most recent call last):
rank1:   File "main.py", line 65, in
rank1:   File "main.py", line 27, in main
rank1:   File "/home/sudhir/Projects/Adobe/DDMI/exp/stage.py", line 151, in first_stage_train
rank1:   File "/home/sudhir/Projects/Adobe/DDMI/tools/d2c_vae/video.py", line 212, in train
rank1:     inputs_2d = torch.gather(x, 2, frame_idx_selected).squeeze(2)
rank1: RuntimeError: Size does not match at dimension 1 expected index [1, 64, 1, 256, 256] to be smaller than self [1, 3, 16, 256, 256] apart from dimension 2
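For reference, the shape rule behind this RuntimeError can be sketched in plain Python. torch.gather requires the index tensor to have the same number of dimensions as the input and to be no larger than the input in every dimension other than the one being gathered along. The helper name gather_shape_error and the "compatible" index shape below are hypothetical and only for illustration; this is not the fix that was applied in the repository.

```python
from typing import Optional, Sequence


def gather_shape_error(
    input_shape: Sequence[int], index_shape: Sequence[int], dim: int
) -> Optional[str]:
    """Hypothetical helper: return None if the shapes satisfy the rule
    torch.gather documents, otherwise a short description of the violation."""
    if len(input_shape) != len(index_shape):
        return "input and index must have the same number of dimensions"
    for d, (in_size, idx_size) in enumerate(zip(input_shape, index_shape)):
        # Only dimensions other than `dim` are constrained.
        if d != dim and idx_size > in_size:
            return f"index size {idx_size} exceeds input size {in_size} at dimension {d}"
    return None


# Shapes copied from the traceback above.
x_shape = (1, 3, 16, 256, 256)
frame_idx_shape = (1, 64, 1, 256, 256)

# Assumption: an index whose dimension 1 matches x would pass the check.
compatible_idx_shape = (1, 3, 1, 256, 256)

print(gather_shape_error(x_shape, frame_idx_shape, dim=2))
print(gather_shape_error(x_shape, compatible_idx_shape, dim=2))
```

Here the index has 64 entries along dimension 1 while x has only 3, which is exactly the mismatch the error message reports.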
Could you please advise on how I can fix this issue and proceed forward?
Additionally, could you let me know the training time for the sky_timelapse dataset and the number and specifications (GB) of the GPUs used?
Thanks!