ExponentialML / Text-To-Video-Finetuning

Finetune ModelScope's Text To Video model using Diffusers 🧨
MIT License
656 stars, 104 forks

About VideoLDM #55

Open: suzhenghang opened this issue 1 year ago

suzhenghang commented 1 year ago

Do you have any knowledge of VideoLDM, and is it possible to integrate its algorithms to further enhance the capabilities of current models, such as generating longer videos?

ExponentialML commented 1 year ago

ModelScope's implementation is very similar to VideoLDM's in that both add a temporal dimension to the model. For long video generation, you could follow this PR, which uses a similar idea: https://github.com/ExponentialML/Text-To-Video-Finetuning/pull/27.
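
To make "adding a temporal dimension" a bit more concrete, here is a rough plain-Python sketch. It is not ModelScope's or VideoLDM's actual code: the function names, the neighbour-averaging rule, and the `alpha` blend factor are all assumptions chosen for illustration. The idea it shows is that a per-frame (spatial) step treats every frame independently, and an extra temporal step then lets each frame see its neighbours in time.

```python
# Illustrative sketch only. All names (spatial_layer, temporal_layer,
# video_block, alpha) are hypothetical and not taken from any library.

def spatial_layer(frame):
    """Per-frame step: each frame is processed on its own (here: scaled)."""
    return [0.5 * x for x in frame]


def temporal_layer(frames, alpha=0.5):
    """Blend each frame with the average of itself and its time neighbours.

    alpha=1.0 keeps the per-frame result unchanged (pure image model);
    alpha=0.0 uses only the neighbour-averaged result.
    """
    num_frames = len(frames)
    mixed = []
    for t in range(num_frames):
        lo, hi = max(0, t - 1), min(num_frames, t + 2)
        window = frames[lo:hi]
        # Average each feature position across the frames in the window.
        averaged = [sum(vals) / len(window) for vals in zip(*window)]
        mixed.append(
            [alpha * x + (1 - alpha) * a for x, a in zip(frames[t], averaged)]
        )
    return mixed


def video_block(video, alpha=0.5):
    """Spatial step on every frame, then one temporal step across frames."""
    per_frame = [spatial_layer(frame) for frame in video]
    return temporal_layer(per_frame, alpha)


# A toy "video": 3 frames, 2 features per frame.
video = [[0.0, 2.0], [4.0, 2.0], [8.0, 2.0]]
out = video_block(video, alpha=0.5)
```

With `alpha=1.0` the temporal step is a no-op, which is one way to picture starting from an image model and gradually letting the temporal part take effect.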

suzhenghang commented 1 year ago

Many thanks. Do you have any recommendations for AI video flicker removal?

ExponentialML commented 1 year ago

No problem! There was a paper released somewhat recently that supposedly tackles this problem, but I can't seem to find it.

For now, I think using a tool outside of the machine learning domain would suffice.
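
As one example of what a non-ML approach could look like, here is a minimal plain-Python sketch that evens out global brightness flicker by rescaling each frame toward the average brightness of its neighbours. This is an assumption about what such a tool might do, not the method of the paper mentioned above; `frame_mean`, `deflicker`, and `window` are hypothetical names, and frames are simplified to flat lists of pixel intensities. It only corrects frame-wide brightness jumps, not local or texture flicker.

```python
# Illustrative sketch only: global brightness deflicker over a list of frames.
# Each frame is a flat list of pixel intensities in the range 0..255.

def frame_mean(frame):
    """Average intensity of a single frame."""
    return sum(frame) / len(frame)


def deflicker(frames, window=3):
    """Rescale every frame so its mean matches a rolling mean over time."""
    means = [frame_mean(frame) for frame in frames]
    half = window // 2
    result = []
    for t, frame in enumerate(frames):
        lo, hi = max(0, t - half), min(len(frames), t + half + 1)
        target = sum(means[lo:hi]) / (hi - lo)
        # Guard against division by zero on fully black frames.
        gain = target / means[t] if means[t] > 0 else 1.0
        result.append([min(255.0, x * gain) for x in frame])
    return result


# The middle frame is noticeably brighter than its neighbours.
frames = [[100.0, 100.0], [130.0, 130.0], [100.0, 100.0]]
smoothed = deflicker(frames)
```

A larger `window` smooths more aggressively but can also wash out intentional brightness changes such as fades.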

suzhenghang commented 1 year ago

Nice, have you tried any tools to alleviate the flickering?

suzhenghang commented 1 year ago

Do you have plans to integrate PR #27 (the long video generation PR) later on?

ExponentialML commented 1 year ago

Found the paper 😉:

https://github.com/chenyanglei/all-in-one-deflicker

kabachuha commented 1 year ago

This person is trying to implement VideoLDM in Diffusers; the last commit was just yesterday:

https://github.com/srpkdyy/VideoLDM