yuanxion opened 1 year ago
Model: text-to-video model with default inputs. Summary: when a prompt includes a color, the tone and hue of the generated video are affected; when the color is white, the results are noticeably better than with other colors.
Some examples:
1. Prompt: a brown bird is flying under blue sky. Video snapshot:
2. Prompt: a red bird is flying under blue sky. Video snapshot:
3. Prompt: a white bird is flying under blue sky. Video snapshot:
4. Prompt: a white cat with blue eye is eating a small fish. Video snapshot:
5. Prompt: a cute dog is playing with a blue ball. Video snapshot:
6. Prompt: a cute dog is playing with a red ball. Video snapshot:
7. Prompt: a cute white dog is running on street. Video snapshot:
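For reference, the prompt sweep above can be scripted. This is a minimal sketch assuming the `diffusers` library and the `damo-vilab/text-to-video-ms-1.7b` checkpoint; the issue does not name the actual model under test, so the checkpoint ID is an assumption and should be swapped for whatever model you are reproducing with:

```python
# Sketch: run the color-variation prompts through a text-to-video pipeline.
# Assumption: diffusers + the ModelScope text-to-video checkpoint; the model
# actually tested in this report is not named, so swap in your own.

prompts = [
    "a brown bird is flying under blue sky",
    "a red bird is flying under blue sky",
    "a white bird is flying under blue sky",
    "a white cat with blue eye is eating a small fish",
    "a cute dog is playing with a blue ball",
    "a cute dog is playing with a red ball",
    "a cute white dog is running on street",
]

def generate_snapshots(model_id="damo-vilab/text-to-video-ms-1.7b"):
    """Generate one short clip per prompt; requires a CUDA GPU."""
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    for i, prompt in enumerate(prompts):
        # One clip per prompt; inspect the exported files for hue shifts.
        frames = pipe(prompt, num_inference_steps=25, num_frames=16).frames[0]
        export_to_video(frames, f"snapshot_{i}.mp4")
```

Keeping all sampler settings identical across prompts isolates the color word as the only variable, which is what the comparison above relies on.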
ControlNet
We can extract a control video, such as edge, depth, or pose maps, from an input video, and then pass it to Stable Diffusion to control the video generation.
As a fun test, I passed the depth video of a fox to Stable Diffusion and asked it to generate a video with the text prompt "oil painting of a deer, a high-quality, detailed, and professional photo". The result is a foxy deer, which looks like this:
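A per-frame sketch of that depth-conditioned setup, assuming `diffusers`' ControlNet support with the `lllyasviel/sd-controlnet-depth` checkpoint; the issue does not say which ControlNet implementation or checkpoints were used, so treat the model IDs below as placeholders:

```python
# Sketch: condition Stable Diffusion on a depth video, frame by frame.
# Assumptions: diffusers' StableDiffusionControlNetPipeline with the
# lllyasviel/sd-controlnet-depth checkpoint. Naive per-frame generation
# gives no temporal consistency, which is partly why the result still
# "looks like a fox" frame to frame.

def stylize_depth_video(depth_frames, prompt, seed=0):
    """depth_frames: list of PIL depth maps; returns stylized PIL frames."""
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    out = []
    for frame in depth_frames:
        # Reusing the same seed for every frame reduces (but does not
        # remove) flicker between consecutive frames.
        generator = torch.Generator("cuda").manual_seed(seed)
        image = pipe(prompt, image=frame, generator=generator).images[0]
        out.append(image)
    return out

# Example call (requires a GPU and depth frames extracted beforehand,
# e.g. with a monocular depth estimator run on each video frame):
# frames = stylize_depth_video(
#     fox_depth_frames,
#     "oil painting of a deer, a high-quality, detailed, and professional photo",
# )
```

Because the depth map fixes the subject's silhouette, the prompt can only restyle the surface, which explains the "foxy deer" outcome.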