Araachie / river

Efficient Video Prediction via Sparsely Conditioned Flow Matching. In ICCV, 2023.
https://araachie.github.io/river
GNU General Public License v3.0

Refinement? #4

Open bridenmj opened 4 months ago

bridenmj commented 4 months ago

Hello, I have a question regarding refinement. Is refinement carried out on the VQGAN after it has been trained on the BAIR dataset? My understanding is that a VQGAN is first trained on BAIR images, and the refinement model then refines it so that frames are reconstructed in a temporally consistent way. Once those two steps are done, the CFM regressor can be trained to generate sequences. Is that correct? Thanks.

Araachie commented 4 months ago

Hi! The refiner is an optional stage that is trained separately from both the VQGAN and the CFM regressor. It is applied in pixel space to the generated frames to improve the temporal consistency of the generated videos. This is needed because the VQGAN processes each video frame independently, which may result in temporally inconsistent decoding artifacts.
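
To illustrate where the refiner sits, here is a toy sketch. It is not the repository's code and makes several assumptions: `decode_frame`, `refine`, and `flicker` are hypothetical names, the "decoder" just adds independent per-frame noise to mimic frame-wise decoding artifacts, and a 3-frame temporal average stands in for the learned refiner. It only shows the order of the stages (generated latents, per-frame decoding, pixel-space refinement) and why independent decoding can cause flicker.

```python
import random


def decode_frame(latent, rng):
    # Hypothetical stand-in for a per-frame VQGAN decoder: every frame is
    # decoded independently, so every frame gets its own decoding error.
    return [z + rng.uniform(-0.1, 0.1) for z in latent]


def refine(frames):
    # Hypothetical stand-in for the learned refiner: a 3-frame temporal
    # average computed in pixel space (the real refiner is a trained model).
    refined = []
    for t in range(len(frames)):
        lo, hi = max(0, t - 1), min(len(frames), t + 2)
        window = frames[lo:hi]
        refined.append([sum(px) / len(window) for px in zip(*window)])
    return refined


def flicker(frames):
    # Mean absolute pixel change between consecutive frames.
    diffs = [
        abs(a - b)
        for f0, f1 in zip(frames, frames[1:])
        for a, b in zip(f0, f1)
    ]
    return sum(diffs) / len(diffs)


rng = random.Random(0)

# Assumed stand-in for the generated latents: a static scene, so any change
# between decoded frames comes only from the per-frame decoding error.
latents = [[0.5] * 16 for _ in range(8)]

decoded = [decode_frame(z, rng) for z in latents]  # frame-by-frame decoding
refined = refine(decoded)                          # optional pixel-space stage

print("flicker before refinement:", flicker(decoded))
print("flicker after refinement: ", flicker(refined))
```

In this sketch the refiner never touches the decoder or the latents; it only post-processes frames that have already been decoded, which is why it can be trained and applied as a separate, optional stage.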