Closed by Daming-TF 11 months ago
Training only on videos for a long enough time seems to work better. If you have more resources, I suggest using all 10M WebVid videos and making more of the network parameters trainable. Another issue is that SD's VAE was not designed for 256x256 inputs, so it could perform better with some fine-tuning.
Hello, did you first train on LAION400M, find that it did not help, and then retrain on the 2M videos from the WebVid dataset for 2 months, after which the results were good?