Vchitect / Latte

Latte: Latent Diffusion Transformer for Video Generation.
Apache License 2.0

torchrun --nnodes=1 --nproc_per_node=2 train_with_img.py --config ./configs/sky/sky_img_train.yaml error #33

Open dpyneo opened 4 months ago

dpyneo commented 4 months ago

Very impressive work. I ran into a few issues when using the SkyTimelapse data for image and video pre-training:

  1. The import at the top of `train_with_img.py` fails (see the import sketch after this list):

     ```python
     from utils import (clip_grad_norm_, create_logger, update_ema,
                        requires_grad, cleanup, create_tensorboard,
                        write_tensorboard, setup_distributed, fetch_files_by_numbers,
                        get_experiment_dir, separation_content_motion)
     ```

     `fetch_files_by_numbers` and `separation_content_motion` are not defined in `utils.py`, so this raises
     `ImportError: cannot import name 'fetch_files_by_numbers' from 'utils' (Latte/utils.py)`.
     After commenting out the corresponding imports, training runs. What do these two functions do? Do they operate directly on the raw videos? Is it safe to simply comment them out?

  2. The `if args.dataset == 'webvideo2mlaion':` branch fails with the following traceback (see the OmegaConf sketch after this list):

     ```
     Traceback (most recent call last):
       File "/data/zqzx/latte/latte_main/latte/train_with_img.py", line 361, in <module>
         main(OmegaConf.load(args.config))
       File "/data/zqzx/latte/latte_main/latte/train_with_img.py", line 221, in main
         logger.info(f"Dataset contains {len(dataset):,} videos ({args.webvideo_data_path})")
       File "/data/miniconda3/envs/yxl/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 355, in __getattr__
         self._format_and_raise(
     ```

     Can this be fixed simply by adding the corresponding entry to sky_img_train.yaml and pointing it at the actual videos (.mp4 files)? Or is this the path through which we are meant to plug in our own video dataset?

  3. `if args.test_run:` — after commenting this out directly, training now runs.
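For point 1, a minimal sketch of the workaround, assuming the two helpers missing from `utils.py` are simply dropped from the import; the remaining names are reconstructed from the error message above and should be checked against your local `utils.py`:

```python
# train_with_img.py: drop the two helpers that do not exist in utils.py
# (fetch_files_by_numbers, separation_content_motion) and keep the rest.
from utils import (clip_grad_norm_, create_logger, update_ema,
                   requires_grad, cleanup, create_tensorboard,
                   write_tensorboard, setup_distributed,
                   get_experiment_dir)
```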
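For point 2, a small self-contained OmegaConf example of why the traceback appears: the sky config has no `webvideo_data_path` entry, and reading a missing key through attribute access raises. The config contents below are invented purely for illustration:

```python
from omegaconf import OmegaConf

# Stand-in for sky_img_train.yaml: note there is no webvideo_data_path entry
# (these contents are invented for illustration only).
cfg = OmegaConf.create({"dataset": "sky", "data_path": "/data/sky_timelapse"})

print(cfg.data_path)  # existing key: fine

try:
    # Missing key: OmegaConf's __getattr__ ends up in _format_and_raise,
    # which is exactly the failure shown in the traceback above.
    print(cfg.webvideo_data_path)
except Exception as err:
    print(type(err).__name__, err)

# One way to guard a dataset-specific key before using it:
if "webvideo_data_path" in cfg:
    print(cfg.webvideo_data_path)
```

Alternatively, adding a `webvideo_data_path` entry to the YAML (or skipping that branch for the sky dataset) avoids the error.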

Thank you.

maxin-cn commented 4 months ago


  1. These functions are auxiliary helpers from my earlier experiments; you do not need them. I have removed the redundant functions.
  2. T2V training code is not currently supported in the Latte repo. However, we provide complete training code for four datasets such as UCF101 and SkyTimelapse (including video-image joint training); you can refer to it and adapt it to your own dataset (see the config sketch below). If your dataset has class labels, please use the UCF101 setup as the base.
  3. `args.test_run` is also an auxiliary flag; I will clean it up.
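For context, a hypothetical sketch of what the dataset-related entries might look like when pointing the training config at a custom folder of .mp4 clips, modeled loosely on the SkyTimelapse setup; the key names here are assumptions and should be verified against the shipped configs/sky/sky_img_train.yaml:

```yaml
# Hypothetical dataset section, modeled on the SkyTimelapse config.
# Key names are assumptions; verify them against configs/sky/sky_img_train.yaml.
dataset: "sky"                     # or your own dataset name once a loader exists for it
data_path: "/path/to/my_videos/"   # directory containing the raw .mp4 clips
num_frames: 16                     # clip length sampled per video
frame_interval: 3                  # temporal stride between sampled frames
```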
dpyneo commented 4 months ago

Thank you very much