Open GloryyrolG opened 1 week ago
E.g., I found: "For training 29x480p_i2v from 93x480p_i2v, with a stronger prior, the reconstruction capability can be adapted in just a few steps; and after about 100 steps, a certain degree of temporal video generation can be restored."
Hi @LinB203 @qqingzheng @clownrat6 et al.,
May I ask what we should pay special attention to when resuming training from a checkpoint with a different temporal length, e.g., training the 93-frame model from a 29-frame checkpoint (stage 5) or vice versa? The 93-frame model does not generate good 29-frame videos due to the use of positional embeddings. Thanks!
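To make the positional-embedding concern concrete, here is a minimal sketch of one generic workaround: linearly resampling a temporal positional table to the new sequence length before resuming training. It assumes the model stores an absolute per-position table of shape (T, D); that is an assumption, not something confirmed for this repository, and the function name `resample_temporal_pos_embed` is hypothetical. If the model uses rotary or sinusoidal embeddings instead, the analogous step would likely be rescaling the position indices rather than resampling a table.

```python
from typing import List


def resample_temporal_pos_embed(
    table: List[List[float]], new_len: int
) -> List[List[float]]:
    """Linearly interpolate a (T, D) positional table to (new_len, D).

    Hypothetical helper: assumes an absolute per-position table, which
    may not match how the actual model encodes temporal positions.
    """
    old_len = len(table)
    if old_len < 1 or new_len < 1:
        raise ValueError("both lengths must be at least 1")
    if old_len == 1 or new_len == 1:
        # Nothing to interpolate between; repeat the first row.
        return [list(table[0]) for _ in range(new_len)]

    out = []
    for i in range(new_len):
        # Map the new index onto the old axis so both endpoints line up.
        pos = i * (old_len - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(table[lo], table[hi])])
    return out


# Toy example: stretch a 3-position table to 5 positions.
short_table = [[0.0], [2.0], [4.0]]
long_table = resample_temporal_pos_embed(short_table, 5)
```

The endpoints are kept fixed, so the first and last positions keep their original embeddings and only the positions in between are interpolated. Whether this is enough, or whether a short fine-tune is still needed afterwards, would have to be checked empirically.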