Open zhaohm14 opened 2 months ago
Thanks for your wonderful work! I am interested in applying autoregressive generation to get length-flexible output. Could this be implemented by changing how the model runs inference, as LLMs do?
Thanks for your interest. Which LLM inference algorithm are you referring to specifically?
I mean generating subsequent frames using the previous frames as input (and perhaps adding a special end token?), instead of generating 16 frames at once. That way, training could accept videos of any length, and the model could generate longer and more length-flexible videos.
I'm not sure how well this would perform, since the model was trained directly on 16-frame videos. You are welcome to try it, and if it gives better results, a PR is welcome.
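For anyone who wants to experiment, the rollout described above could be sketched roughly as follows. This is only a minimal illustration and not code from this repository: `generate_chunk`, `is_end`, `context_len`, and the toy generator are all hypothetical names, and a real version would need a model that can be conditioned on previous frames, which the current 16-frame training setup may not support well.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]  # placeholder type; a real frame would be an image tensor


def autoregressive_rollout(
    generate_chunk: Callable[[Sequence[Frame], int], List[Frame]],
    first_frames: Sequence[Frame],
    max_frames: int,
    chunk_len: int = 16,
    context_len: int = 4,
    is_end: Optional[Callable[[Frame], bool]] = None,
) -> List[Frame]:
    """Extend a video chunk by chunk, feeding the latest frames back as context.

    generate_chunk is a hypothetical stand-in for the model: given context
    frames and a chunk length, it returns that many new frames.
    is_end is a hypothetical detector for a special "end" frame or token.
    """
    video = list(first_frames)
    while len(video) < max_frames:
        # Condition only on the most recent frames.
        context = video[-context_len:]
        new_frames = generate_chunk(context, chunk_len)
        if not new_frames:
            break  # the model produced nothing, so stop
        for frame in new_frames:
            if is_end is not None and is_end(frame):
                return video  # end signal reached; drop the end frame itself
            video.append(frame)
            if len(video) >= max_frames:
                break
    return video


def toy_generate_chunk(context: Sequence[Frame], chunk_len: int) -> List[Frame]:
    """Toy stand-in for a model: each new frame counts up from the last one."""
    last = context[-1][0]
    return [[last + i + 1] for i in range(chunk_len)]


if __name__ == "__main__":
    video = autoregressive_rollout(toy_generate_chunk, [[0.0]], max_frames=40)
    print(len(video))
```

The loop stops either at `max_frames` or when the optional end detector fires, which is one way to get length-flexible output. Whether quality holds up across chunk boundaries is an open question, as noted above.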