Hi authors, thanks for the well-written code. Since I'm a novice in video generation, I'm curious about the difference between the two scripts you provided, i.e., scripts/train_pyramid_flow.sh and scripts/train_pyramid_flow_without_ar.sh. It seems that the first one is for t2v and the second one is for t2i. Does that mean I need to fine-tune separately for t2i and t2v?