Open Zhangwenyao1 opened 2 weeks ago
Thanks for your interest in our work!
Yes, we select either the transitional or the final frame as the target frame for reconstruction, conditioned on the other two frames and the task description.
As a representation learning framework, MPI does not prioritize the visual quality of reconstructed images, so we do not include qualitative results in our paper.
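For readers who want something concrete, the target-frame selection described in this reply could be sketched roughly as follows. This is only an illustration under assumptions, not MPI's actual code: the function name `split_target_and_context`, the three-frame tuple layout, and the uniform random choice between the transitional and final frames are all hypothetical.

```python
import random
from typing import Optional, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def split_target_and_context(
    frames: Sequence[Frame],
    rng: Optional[random.Random] = None,
) -> Tuple[int, Frame, Tuple[Frame, ...]]:
    """Pick a reconstruction target from (initial, transitional, final).

    Hypothetical helper: the target is either the transitional frame
    (index 1) or the final frame (index 2); the remaining two frames are
    returned as the conditioning context. A uniform choice is an
    assumption made for this sketch.
    """
    if len(frames) != 3:
        raise ValueError("expected exactly three frames")
    rng = rng or random.Random()
    target_idx = rng.choice((1, 2))  # never the initial frame
    target = frames[target_idx]
    context = tuple(f for i, f in enumerate(frames) if i != target_idx)
    return target_idx, target, context
```

In a full pipeline, the context frames and the task description would then be passed to the model, which is trained to reconstruct the held-out target frame.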
Thanks for your great work!
I would like to know what target frame prediction is. Does it mean that you reconstruct the whole image conditioned on the other two images? Is there any visualization of the reconstructed frames?