mymusise opened this issue 1 month ago
For anime style, I haven't really gotten anything much better. Best one so far is just something like this:
https://github.com/user-attachments/assets/2bf9b38b-df99-4805-9f4e-7de39c81a02e
@kijai how does one output only the video portion (the block on the right side)? My I2V gens using the workflow also create this side-by-side video, but I'd like to have only the video portion (without the static image on the left).
@kijai what settings do you use for img2vid? When I use the defaults, it usually ends up with a bunch of artifacts, or is that normal? Also, does any resolution work, or how did you get it to work if it isn't 1280x768?
> @kijai how does one output only the video portion (the block on the right side)? My I2V gens using the workflow also create this side-by-side video, but I'd like to have only the video portion (without the static image on the left).
OK, I solved this by bypassing the Image Concat node entirely. A suggestion would be to add a switch to turn the side-by-side format on/off (i.e., concatenate image with video, yes/no).
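For anyone curious what that concat step does conceptually, here is a rough sketch assuming ComfyUI-style IMAGE tensors of shape [frames, height, width, channels]; the function name and layout are for illustration only, not the node's actual code:

```python
import torch

def side_by_side(ref_image: torch.Tensor, frames: torch.Tensor) -> torch.Tensor:
    """Tile one still image next to every video frame.

    ref_image: [1, H, W, C], frames: [T, H, W, C] (ComfyUI IMAGE layout, assumed).
    Returns [T, H, 2*W, C]. Skipping this step is what leaves only the video.
    """
    tiled = ref_image.expand(frames.shape[0], -1, -1, -1)  # repeat the still image T times
    return torch.cat([tiled, frames], dim=2)               # join along the width axis
```

Bypassing the node (or putting a yes/no switch around it) just passes the frames through unchanged.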
@lijackcoder the workflow worked out of the box for me; make sure to download the correct models/ckpt as well as controlnet, and place them in the appropriate folders. I didn't get any artifacts, but the gens so far are less stable/usable than CogVideo I2V.
@mptorr Apparently there was an update 14 hours ago for both the 384p and 768p models. I downloaded the versions before that update. No idea if it will make a difference or not, but I will test it out.

> as well as controlnet

But my question is: what do you mean by controlnet?
It's a component that's necessary to run this workflow... you must follow the instructions @kijai indicates on the main page, and have this directory structure within your custom-nodes folder. Note that doing only a git clone will usually not download the larger files; you either download them manually or need to install git LFS.
```
\ComfyUI\models\pyramidflow\pyramid-flow-sd3
├───causal_video_vae
│       config.json
│       diffusion_pytorch_model.safetensors
│
├───diffusion_transformer_384p
│       config.json
│       diffusion_pytorch_model.safetensors
│
├───diffusion_transformer_768p
│       config.json
│       diffusion_pytorch_model.safetensors
│
├───text_encoder
│       config.json
│       model.safetensors
│
├───text_encoder_2
│       config.json
│       model.safetensors
│
├───text_encoder_3
│       config.json
│       model-00001-of-00002.safetensors
│       model-00002-of-00002.safetensors
│       model.safetensors.index.json
│
├───tokenizer
│       merges.txt
│       special_tokens_map.json
│       tokenizer_config.json
│       vocab.json
│
├───tokenizer_2
│       merges.txt
│       special_tokens_map.json
│       tokenizer_config.json
│       vocab.json
│
└───tokenizer_3
        special_tokens_map.json
        spiece.model
        tokenizer.json
        tokenizer_config.json
```
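If git LFS gives you trouble, here is a minimal sketch of fetching the same files with `huggingface_hub` instead; the repo id `rain1011/pyramid-flow-sd3` and the target path are assumptions based on the tree above, so adjust them for your install:

```python
# Minimal sketch: pull the Pyramid Flow weights with huggingface_hub instead
# of git LFS. The repo id and target folder below are assumptions based on
# the directory tree above -- verify them against the model page / README.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="rain1011/pyramid-flow-sd3",                      # assumed repo id
    local_dir="ComfyUI/models/pyramidflow/pyramid-flow-sd3",  # matches the tree above
    local_dir_use_symlinks=False,  # write real files so ComfyUI can load them
)
```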
okay thanks
I just let the model downloader from the workflow do its work, though it seems to be working fine for me that way for I2V. It's running on only 8.7 GB VRAM, so I guess I messed something up? I have no idea... just leaving this here because tomorrow I'll have time to actually test whether it does anything better than CogVideo text-to-video/I2V, etc. It's running at 20 s/it though, kinda slow.
That's expected VRAM use at the beginning; it ramps up a bit as the model works in stages.
FWIW I get 13.8 s/it on a local RTX 4090 when doing I2V (the first gen is always slower and gives 17 s/it).
Yeah, I noticed that just now: on my 4070 Ti Super the first gen starts at 20 s/it and goes up to 28 s/it, and the next one starts at 16 s/it. The results are high quality in terms of image sharpness, but I haven't had much luck guiding any motion with the prompt for I2V so far, so I'll leave it at that for today.

kijai said the model works in stages, and I can literally hear that as the wind turbines on my card go wild (it has 4 fans, by the way, lol); thankfully I have an undervolt preset.
I can hear what stage it's on from the coil whine of the GPU alone... It's also possible to set the steps separately for the different stages (3 of them by default), which can cut down the sampling time a lot. I'm not sure which way is better, though; it felt like it's better to use fewer steps at the earlier stages and more at the later ones.
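To make that concrete, here is a small illustrative sketch of splitting a total step budget across the three stages with more weight on the later ones; the stage weights and function name are hypothetical, not the wrapper's actual node parameters:

```python
# Illustrative only: split a total step budget across the 3 sampling stages,
# weighting the later stages more heavily (per the comment above). The stage
# count, weights, and function name are assumptions, not the wrapper's API.
def split_steps(total_steps: int, weights=(1, 2, 3)) -> list[int]:
    scale = total_steps / sum(weights)
    steps = [max(1, round(w * scale)) for w in weights]
    # Absorb any rounding drift into the last (heaviest) stage.
    steps[-1] += total_steps - sum(steps)
    return steps

print(split_steps(30))  # e.g. [5, 10, 15]: fewer steps early, more later
```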
How long does a generation for you usually take?
It's not video; it's like Flash animation from back in the 2000s.
Is this effect normal for Image2Video?
prompt: a girl with blue eyes standing in front of a backdrop of trees, water, and a clear blue sky. She is wearing a t-shirt and shorts, and her face is illuminated by the sun. hyper quality, Ultra HD, 8K
https://github.com/user-attachments/assets/bd9706b9-633e-4d6c-b618-0fc09d30d951