yuanxion / Text2Video-Zero

Text-to-Image Diffusion Models are Zero-Shot Video Generators

Interesting observations in the generated files #16

Open yuanxion opened 1 year ago

yuanxion commented 1 year ago

ControlNet

We can extract a control video from an input clip, such as edge maps, depth maps, or pose skeletons, and then pass it to Stable Diffusion to control the video generation.
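The extraction step can be illustrated with a small per-frame sketch. This is not code from this repo: it assumes OpenCV for video I/O and the Hugging Face transformers depth-estimation pipeline with the Intel/dpt-large checkpoint, and only shows one way to turn a clip into a depth video.

```python
import cv2
import numpy as np
from PIL import Image
from transformers import pipeline

# Hypothetical input/output paths; replace with your own clip.
VIDEO_IN = "fox.mp4"
VIDEO_OUT = "fox_depth.mp4"

# Monocular depth estimator (Intel/dpt-large is one commonly used checkpoint).
depth_estimator = pipeline("depth-estimation", model="Intel/dpt-large")

cap = cv2.VideoCapture(VIDEO_IN)
fps = cap.get(cv2.CAP_PROP_FPS) or 8.0
writer = None

while True:
    ok, frame_bgr = cap.read()
    if not ok:
        break
    # OpenCV gives BGR; the HF pipeline expects a PIL RGB image.
    frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    depth = depth_estimator(Image.fromarray(frame_rgb))["depth"]  # PIL image
    depth = np.array(depth, dtype=np.float32)
    # Normalize to 0-255 so the depth video can be saved and later fed to ControlNet.
    depth = (255 * (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)).astype(np.uint8)
    depth_bgr = cv2.cvtColor(depth, cv2.COLOR_GRAY2BGR)
    if writer is None:
        h, w = depth_bgr.shape[:2]
        writer = cv2.VideoWriter(VIDEO_OUT, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    writer.write(depth_bgr)

cap.release()
if writer is not None:
    writer.release()
```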

As a fun test, I passed the depth video of a fox to Stable Diffusion,

[snapshot of the fox depth video]

and asked it to generate a video with the text prompt "oil painting of a deer, a high-quality, detailed, and professional photo". The result is a rather foxy deer: [generated video: text2video_depth_control_oil painting of a deer-fox, a high-quality, detailed, and professional photo]
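For the generation side, here is a minimal frame-by-frame sketch using diffusers' depth ControlNet (lllyasviel/sd-controlnet-depth) on top of SD 1.5. This is only an approximation of what Text2Video-Zero does: the paper additionally uses cross-frame attention and latent motion dynamics for temporal consistency, so a naive per-frame loop like this will flicker more.

```python
import cv2
import torch
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

# Depth-conditioned ControlNet on top of SD 1.5 (public checkpoints, not part of this repo).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

prompt = "oil painting of a deer, a high-quality, detailed, and professional photo"

# Read the depth video produced by the extraction sketch above (path is hypothetical).
cap = cv2.VideoCapture("fox_depth.mp4")
frames = []
while True:
    ok, depth_bgr = cap.read()
    if not ok:
        break
    depth = Image.fromarray(cv2.cvtColor(depth_bgr, cv2.COLOR_BGR2RGB))
    # Re-using the same seed for every frame keeps the appearance roughly stable; the real
    # Text2Video-Zero pipeline instead uses cross-frame attention for temporal consistency.
    generator = torch.Generator(device="cuda").manual_seed(42)
    frames.append(pipe(prompt, image=depth, num_inference_steps=20, generator=generator).images[0])
cap.release()

frames[0].save("deer_from_fox_depth.gif", save_all=True,
               append_images=frames[1:], duration=125, loop=0)
```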

wangleflex commented 1 year ago

Model: text-to-video model with default inputs.
Summary: when prompts include a color, the tone and hue of the video are affected; a white subject gives better results than other colors. A sketch for reproducing this sweep follows the examples below.

Some examples:

1. Prompt: "a brown bird is flying under blue sky". Video snapshot: [image]

2. Prompt: "a red bird is flying under blue sky". Video snapshot: [image]

3. Prompt: "a white bird is flying under blue sky". Video snapshot: [image]

4. Prompt: "a white cat with blue eye is eating a small fish". Video snapshot: [image]

5. Prompt: "a cute dog is playing with a blue ball". Video snapshot: [image]

6. Prompt: "a cute dog is playing with a red ball". Video snapshot: [image]

7. Prompt: "a cute white dog is running on street". Video snapshot: [image]
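For reference, a prompt sweep like this can be scripted. The sketch below assumes the Model class and process_text2video entry point shown in the Text2Video-Zero README, run with default parameters; the exact signature may differ in this fork.

```python
import torch
from model import Model  # Text2Video-Zero entry point (assumed, as in its README)

model = Model(device="cuda", dtype=torch.float16)

# The color-variation prompts from the examples above.
prompts = [
    "a brown bird is flying under blue sky",
    "a red bird is flying under blue sky",
    "a white bird is flying under blue sky",
    "a white cat with blue eye is eating a small fish",
    "a cute dog is playing with a blue ball",
    "a cute dog is playing with a red ball",
    "a cute white dog is running on street",
]

for prompt in prompts:
    out_path = f"./text2video_{prompt.replace(' ', '_')}.mp4"
    # Default text-to-video settings; only the prompt changes between runs.
    model.process_text2video(prompt, fps=4, path=out_path)
```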