Open bridenmj opened 4 months ago
Hi! The refiner is an optional stage that is trained separately from the VQGAN and the CFM regressor. It is applied in pixel space to refine the generated frames and improve the temporal consistency of the generated videos. This is needed as the VQGAN processes the video frames independently, which may result in temporally inconsistent decoding artifacts.
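To make the role of the refiner concrete, here is a minimal toy sketch in NumPy. It is not the repository's code: `decode_frames_independently`, `toy_refiner`, `temporal_flicker`, and `demo` are hypothetical names, the "decoder" just adds independent noise to each frame to mimic per-frame decoding artifacts, and the "refiner" is a fixed temporal blur standing in for what would really be a separately trained pixel-space network.

```python
import numpy as np


def decode_frames_independently(latents, decode_frame):
    """Decode each latent frame on its own, as a per-frame decoder would.

    `decode_frame` is a hypothetical callable standing in for a VQGAN decoder.
    """
    return np.stack([decode_frame(z) for z in latents], axis=0)


def toy_refiner(frames, kernel=(0.25, 0.5, 0.25)):
    """Hypothetical stand-in for a learned pixel-space refiner.

    Blends each frame with its temporal neighbours (edge frames are
    replicated). A real refiner would be a trained network, not a fixed filter.
    """
    padded = np.concatenate([frames[:1], frames, frames[-1:]], axis=0)
    k_prev, k_cur, k_next = kernel
    return k_prev * padded[:-2] + k_cur * padded[1:-1] + k_next * padded[2:]


def temporal_flicker(frames):
    """Mean absolute difference between consecutive frames."""
    return float(np.mean(np.abs(np.diff(frames, axis=0))))


def demo(seed=0):
    rng = np.random.default_rng(seed)
    latents = np.zeros((8, 4, 4))  # a static scene: 8 frames of 4x4 "latents"

    # Assumption: independent noise per frame mimics decoding artifacts
    # that differ from frame to frame.
    def decode_frame(z):
        return z + 0.1 * rng.standard_normal(z.shape)

    frames = decode_frames_independently(latents, decode_frame)
    refined = toy_refiner(frames)
    return temporal_flicker(frames), temporal_flicker(refined)


if __name__ == "__main__":
    before, after = demo()
    print(f"flicker before refinement: {before:.4f}")
    print(f"flicker after refinement:  {after:.4f}")
```

The point of the sketch is only the data flow: frames are decoded one at a time, and the refinement step then operates on the decoded pixels across time without changing the decoder itself.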
Hello, I have a question regarding refinement. Refinement is carried out on the VQGAN after it has been trained on the BAIR dataset, correct? My understanding is that a VQGAN is first trained on BAIR images, and the refinement model then refines it so that frames are reconstructed in a temporally consistent way. Once those two steps are done, the CFM regressor can be trained to generate sequences. Is that correct? Thanks.