Open zhanghongyong123456 opened 1 year ago
I get the same result as you.
Inaccurate reconstruction has two causes: (i) inaccurate DDIM inversion, and (ii) the imperfect VAE autoencoder, whose latent space does not reproduce the input frames exactly.
Interestingly, our method may still overcome issues with the DDIM inversion thanks to our TokenFlow injection. For example, the editing result for this video does not exhibit the artifacts that occur in the DDIM inversion process.
Yes, I also experienced this issue. In my experience, it happens because each frame is inverted independently (and it becomes more severe when fewer DDIM steps are used). However, if you use cross-frame attention and TokenFlow propagation during DDIM inversion and reconstruction, the issue is resolved even for the reconstructed video.
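To illustrate the step-count effect mentioned above, here is a minimal sketch, not code from this repo. It assumes a toy noise predictor (`toy_eps`, hypothetical, the optimal predictor when the data is standard normal) and a linear beta schedule, and it treats the input as the latent at the first sampled timestep. Names such as `ddim_move` and `round_trip_error` are made up for the example. It only shows how the error of a DDIM inversion followed by DDIM sampling depends on the number of steps; it does not model cross-frame attention or TokenFlow propagation.

```python
import numpy as np


def alphas_cumprod(num_train_steps=1000, beta_start=1e-4, beta_end=0.02):
    # Linear beta schedule (an assumption made for this sketch).
    betas = np.linspace(beta_start, beta_end, num_train_steps)
    return np.cumprod(1.0 - betas)


def toy_eps(x, a_t):
    # Hypothetical noise predictor: E[eps | x_t] when x_0 ~ N(0, I).
    return np.sqrt(1.0 - a_t) * x


def ddim_move(x, a_from, a_to):
    # One deterministic DDIM update from noise level a_from to a_to.
    # The same formula is used for inversion (a_to < a_from) and for
    # sampling (a_to > a_from); eps is always evaluated at the current x,
    # which is what makes inversion only approximate.
    eps = toy_eps(x, a_from)
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def round_trip_error(x0, num_steps):
    # Invert x0 to high noise, sample back, and report the max abs error.
    acp = alphas_cumprod()
    ts = np.linspace(0, len(acp) - 1, num_steps).astype(int)
    x = x0.copy()
    for t_from, t_to in zip(ts[:-1], ts[1:]):  # inversion: low -> high noise
        x = ddim_move(x, acp[t_from], acp[t_to])
    rev = ts[::-1]
    for t_from, t_to in zip(rev[:-1], rev[1:]):  # sampling: high -> low noise
        x = ddim_move(x, acp[t_from], acp[t_to])
    return float(np.max(np.abs(x - x0)))


if __name__ == "__main__":
    x0 = np.random.default_rng(0).standard_normal(16)
    for n in (10, 50, 100, 500):
        print(n, round_trip_error(x0, n))
```

In this toy setting the round-trip error shrinks as the number of steps grows, even though the noise predictor is exact, so the error comes purely from the discretization of the inversion.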
I am in total agreement.
@anime26398 So is TokenFlow propagation actually implemented in this repo? I can't find 'compute nn fields' or 'tokenflow propagation'; it looks like it just uses PnP instead.
test cmd: python preprocess.py
https://github.com/omerbt/TokenFlow/assets/48466610/3fee547d-f65c-4af0-bee7-5712229c582d