Enhance positional encoding adjustment in SparseCtrl loading with exp…

Kosinkadink / ComfyUI-Advanced-ControlNet

ControlNet scheduling and masking nodes with sliding context support

GNU General Public License v3.0

This PR adds an expected_seq_len parameter to the SparseCtrl model loading process so that positional encoding (PE) dimensions can be adjusted dynamically. The positional encodings then stay compatible with models that expect a different sequence length, which makes loading more flexible for models trained with a variety of configurations.

Changes include:

Addition of expected_seq_len to SparseSettings, allowing for customizable sequence length settings.
Implementation of dynamic adjustment for pos_encoder.pe parameters within the adjust_positional_encoding_parameters function, ensuring that the PE tensors match the expected sequence length.
Adaptation of SparseCtrlLoaderAdvanced and SparseCtrlMergedLoaderAdvanced classes to utilize the expected_seq_len setting, providing a seamless integration into the model loading workflow.
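
As a rough illustration of the PE adjustment described in the list above, here is a minimal, framework-free sketch. The PR description does not say how the tensors are resized, so this assumes linear interpolation along the sequence axis. The helper names resize_pe and adjust_pe_in_state_dict are hypothetical, and the PE table is simplified to a plain list of rows (seq_len x dim) rather than a torch tensor. Only the pos_encoder.pe key suffix and the expected_seq_len name come from the PR text.

```python
from typing import Any, Dict, List

# Simplified stand-in for a PE tensor: rows are sequence positions,
# columns are channels. (Assumption: the real code works on torch tensors.)
Matrix = List[List[float]]


def resize_pe(pe: Matrix, expected_seq_len: int) -> Matrix:
    """Linearly resample a (seq_len, dim) PE table to expected_seq_len rows."""
    if expected_seq_len < 1:
        raise ValueError("expected_seq_len must be positive")
    old_len = len(pe)
    if old_len == 0:
        raise ValueError("pe must contain at least one row")
    if old_len == expected_seq_len:
        return [row[:] for row in pe]
    if old_len == 1 or expected_seq_len == 1:
        # Nothing to interpolate between: repeat the first row.
        return [pe[0][:] for _ in range(expected_seq_len)]

    # Map new index i onto the old axis so that both endpoints line up.
    scale = (old_len - 1) / (expected_seq_len - 1)
    out: Matrix = []
    for i in range(expected_seq_len):
        x = i * scale
        lo = min(int(x), old_len - 2)  # left neighbour, clamped
        t = x - lo                     # blend weight toward right neighbour
        out.append([(1 - t) * a + t * b for a, b in zip(pe[lo], pe[lo + 1])])
    return out


def adjust_pe_in_state_dict(
    state_dict: Dict[str, Any], expected_seq_len: int
) -> Dict[str, Any]:
    """Return a copy of state_dict with every '...pos_encoder.pe' entry resized.

    Hypothetical helper: the key-suffix match is an assumption based on the
    'pos_encoder.pe' name mentioned in the PR description.
    """
    return {
        key: resize_pe(value, expected_seq_len)
        if key.endswith("pos_encoder.pe")
        else value
        for key, value in state_dict.items()
    }
```

For example, calling adjust_pe_in_state_dict(sd, 48) would stretch every matching PE table in sd from its stored length to 48 positions and leave all other entries untouched. Truncating or padding the table instead of interpolating would be an equally plausible design; the sketch only shows where the expected_seq_len setting would enter.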

This upgrade addresses potential type mismatches and enhances the model's adaptability to different sequence lengths, streamlining the process for users and maintaining robustness across diverse model configurations.

Hey, thanks for the PR! SparseCtrl is something I've wanted to revisit for a while, so I should have time to review this sometime in the next two to three weeks.

PE adjustment is something I was thinking of exposing as one of the options for dealing with context lengths that exceed the built-in 32, but in addition, I also want to enable the use of View Options, and to add an option to make the input images 'context aware'. The context awareness would make it automatically put in one (or more) of the condhint inputs into context windows that have no condhint images if the manual indexes/spread methods don't have an explicit image to use.

For the purposes of PE adjustment and View Options support, I plan on copying over a bunch of changes I made in AnimateDiff-Evolved over to SparseCtrl to support it (and to make my life easier by keeping the motion module code as similar as I can between AnimateDiff-Evolved and Advanced-ControlNet), so I'm not sure if I would accept this PR when I do review it, but I do want to acknowledge that something akin to the changes you propose to work with context_lengths greater than 32 will be added one way or another!

Enhance positional encoding adjustment in SparseCtrl loading with exp… #83