A node suite for ComfyUI that lets you load an image sequence and generate a new image sequence with a different style or content.
Original repo: https://github.com/sylym/stable-diffusion-vid2vid
First, install ComfyUI.
Then run:
```
cd ComfyUI/custom_nodes
git clone https://github.com/sylym/comfy_vid2vid
cd comfy_vid2vid
```
Next, install the dependencies:
```
python -m pip install -r requirements.txt
```
For the ComfyUI portable standalone build:
```
# You may need to adjust "..\..\..\python_embeded\python.exe" depending on the location of your python_embeded folder
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt
```
All nodes are classified under the vid2vid category. For some workflow examples, see:
Load an image sequence from a folder.

Inputs: None

Outputs:
- IMAGE
- MASK_SEQUENCE

Parameters:
- image_sequence_folder: the folder containing the image sequence, placed inside the input folder
- sample_start_idx
- sample_frame_rate
- n_sample_frames

The number of images in image_sequence_folder must be greater than or equal to sample_start_idx - 1 + n_sample_frames * sample_frame_rate. When using CheckpointLoaderSimpleSequence to generate a sequence of pictures, set n_sample_frames >= 3. A small arithmetic sketch of this requirement follows below.
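As a quick sanity check of that requirement, here is an illustrative Python sketch (not part of the node suite); the exact sampling pattern shown in the slice is an assumption based on the parameter names.

```python
import os

def check_frame_count(image_sequence_folder, sample_start_idx, sample_frame_rate, n_sample_frames):
    """Illustrative check of the frame-count requirement documented above."""
    frames = sorted(
        f for f in os.listdir(image_sequence_folder)
        if f.lower().endswith((".png", ".jpg", ".jpeg"))
    )
    required = sample_start_idx - 1 + n_sample_frames * sample_frame_rate
    if len(frames) < required:
        raise ValueError(f"Need at least {required} images, found {len(frames)}")
    # Assumed sampling pattern: every sample_frame_rate-th frame, starting at sample_start_idx.
    return frames[sample_start_idx - 1 : required : sample_frame_rate]

# Example: sample_start_idx=1, sample_frame_rate=2, n_sample_frames=8
# requires at least 1 - 1 + 8 * 2 = 16 images in the folder.
```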
Load a mask sequence from a folder.
Inputs: None

Outputs:
- MASK_SEQUENCE

Parameters:
- image_sequence_folder: the folder containing the mask sequence, placed inside the input folder
- channel
- sample_start_idx
- sample_frame_rate
- n_sample_frames

The number of images in image_sequence_folder must be greater than or equal to sample_start_idx - 1 + n_sample_frames * sample_frame_rate.
Encode the input image sequence into latent vectors using a Variational Autoencoder (VAE) model, and attach the image mask sequence to the resulting latents; a conceptual sketch follows the parameter list below.
Inputs:
- pixels: IMAGE
- vae: VAE
- mask_sequence: MASK_SEQUENCE

Outputs: LATENT, for use with the KSamplerSequence node.

Parameters:
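For intuition about what this step does, here is a minimal, hypothetical sketch assuming a diffusers-style VAE (vae.encode(...).latent_dist) and the common Stable Diffusion 0.18215 latent scale; the dict keys mirror ComfyUI's usual latent format and are assumptions, not this node's actual code.

```python
import torch
import torch.nn.functional as F

def encode_frames_with_mask(vae, frames, mask_sequence):
    """Hypothetical sketch: encode a frame batch to latents and attach a downscaled mask."""
    # frames: [F, H, W, C] in 0..1 (ComfyUI IMAGE layout); mask_sequence: [F, H, W] in 0..1
    x = frames.permute(0, 3, 1, 2) * 2.0 - 1.0                  # to [F, C, H, W], range -1..1
    with torch.no_grad():
        latents = vae.encode(x).latent_dist.sample() * 0.18215  # SD-style scaling (assumption)
    # Downscale the mask to the latent resolution so it can gate which regions get resampled.
    mask = F.interpolate(mask_sequence.unsqueeze(1), size=latents.shape[-2:], mode="nearest")
    return {"samples": latents, "noise_mask": mask}
```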
Generate a specific noise vector by inverting the input latent vector with DDIM inversion. Usually used to improve the temporal consistency of the output image sequence; a toy sketch of the inversion update follows the parameter list below.
Inputs:
- samples: LATENT
- model: MODEL
- clip: CLIP

Outputs: NOISE

Parameters:
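As background (this is not the node's code), DDIM inversion runs the deterministic DDIM update in reverse, mapping a clean latent back to the noise that would have produced it. The snippet below is a toy sketch with a stand-in noise predictor and an illustrative schedule.

```python
import torch

def ddim_invert(x0, eps_model, alphas_cumprod):
    """Toy DDIM inversion: walk a clean latent x0 forward towards pure noise.
    eps_model(x, t) is a stand-in for the UNet noise predictor."""
    x = x0
    for t in range(len(alphas_cumprod) - 1):
        a_t, a_next = alphas_cumprod[t], alphas_cumprod[t + 1]
        eps = eps_model(x, t)
        # Reconstruct the current estimate of x0, then step to the next (noisier) latent.
        pred_x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()
        x = a_next.sqrt() * pred_x0 + (1 - a_next).sqrt() * eps
    return x  # the "noise vector" handed to the next node

# Illustrative usage with a dummy predictor and a decreasing schedule:
alphas_cumprod = torch.linspace(0.9999, 0.01, 50)
noise = ddim_invert(torch.randn(1, 4, 64, 64), lambda x, t: torch.zeros_like(x), alphas_cumprod)
```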
Add a noise vector to a latent vector.
Inputs:
- samples: LATENT
- noise: NOISE

Outputs: LATENT, for use with the KSamplerSequence node.

Parameters:
Load a checkpoint into a UNet3DConditionModel. Usually used to generate a sequence of pictures with temporal continuity.
Inputs: None

Outputs:
- ORIGINAL_MODEL
- CLIP
- VAE

Parameters: the checkpoint to load, selected from the models/checkpoints folder.

Same function as the LoraLoader node, but acts on a UNet3DConditionModel. Used after the CheckpointLoaderSimpleSequence node and before the TrainUnetSequence node. The model input and output are both of type ORIGINAL_MODEL.
Fine-tune the incoming model on the latent vectors and conditioning context, then convert the model to inference mode; a generic sketch of such a fine-tuning loop follows the parameter list below.
Inputs:
- samples: LATENT
- model: ORIGINAL_MODEL
- context: CONDITIONING

Outputs: MODEL

Parameters:
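To illustrate the general idea (this is not the node's implementation), single-video fine-tuning for diffusion models typically adds noise to the video latents, asks the model to predict that noise given the conditioning, and optimizes for a small number of steps before switching to eval mode. The model call signature, schedule, and hyperparameters below are assumptions.

```python
import torch
import torch.nn.functional as F

def train_unet(model, latents, context, steps=50, lr=1e-4):
    """Generic sketch: fine-tune a noise-prediction model on one latent sequence, then switch to inference mode."""
    alphas_cumprod = torch.linspace(0.9999, 0.01, 1000)  # illustrative noise schedule
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        noise = torch.randn_like(latents)
        t = torch.randint(0, 1000, (latents.shape[0],))
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        noisy = a.sqrt() * latents + (1 - a).sqrt() * noise  # forward diffusion at timestep t
        pred = model(noisy, t, context)                      # assumed signature: predicts the added noise
        loss = F.mse_loss(pred, noise)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    model.eval()  # "convert the model to inference mode"
    return model
```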
Same function as the KSampler node, but with added support for a noise vector and an image mask sequence.
The UNet3DConditionModel has high GPU memory requirements. If you encounter an out-of-memory error, try reducing n_sample_frames. However, n_sample_frames must be greater than or equal to 3.
Some custom nodes do not support processing image sequences. The nodes listed below have been tested and are working properly: