baaivision / Emu3

Next-Token Prediction is All You Need
Apache License 2.0
303 stars, 2 forks

Video Tokenizer Inference #6

Open Epiphqny opened 11 hours ago

Epiphqny commented 11 hours ago

Dear authors, thanks for your awesome work. Could you please provide inference code for video tokenizer reconstruction? I wrote my own, but the reconstruction quality becomes bad after the first 4 frames. What could be causing this? Below are the reconstructions of the 1st, 5th, and 9th frames. [images: frame_0000, frame_0004, frame_0008]

matbee-eth commented 5 hours ago

That looks like diffusion happening without any noise applied, no?