Kosinkadink / ComfyUI-Advanced-ControlNet

ControlNet scheduling and masking nodes with sliding context support
GNU General Public License v3.0
622 stars, 60 forks

Enhance positional encoding adjustment in SparseCtrl loading with exp… #83

Open nazgut opened 8 months ago

nazgut commented 8 months ago

Enhanced the model loading process for SparseCtrl models by introducing the expected_seq_len parameter to dynamically adjust positional encoding (PE) dimensions. This improvement ensures the compatibility of positional encodings with models expecting different sequence lengths, enhancing the flexibility and usability of model loading, especially when dealing with models trained with a variety of configurations.

Changes include:

This upgrade addresses potential type mismatches and enhances the model's adaptability to different sequence lengths, streamlining the process for users and maintaining robustness across diverse model configurations.
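To illustrate the idea of adjusting a positional encoding to an expected sequence length, here is a minimal, library-free sketch. It is not the code from this PR: the function name resize_positional_encoding and the choice of linear interpolation are assumptions made for illustration, and only the expected_seq_len parameter name comes from the description above. A real implementation would operate on the model's PE tensor in the state dict rather than on nested lists.

```python
from typing import List


def resize_positional_encoding(
    pe: List[List[float]], expected_seq_len: int
) -> List[List[float]]:
    """Resize a (seq_len, dim) positional-encoding table to expected_seq_len rows.

    Hypothetical helper: rows are linearly interpolated along the sequence
    axis, so the first and last rows of the original table are preserved.
    """
    if expected_seq_len < 1:
        raise ValueError("expected_seq_len must be >= 1")
    if not pe:
        raise ValueError("pe must contain at least one row")

    old_len = len(pe)
    if expected_seq_len == old_len:
        # Nothing to adjust; return a copy so the caller can mutate it safely.
        return [row[:] for row in pe]
    if old_len == 1 or expected_seq_len == 1:
        # Degenerate cases: repeat (or keep) the first row.
        return [pe[0][:] for _ in range(expected_seq_len)]

    # Map each new position onto the original [0, old_len - 1] range.
    scale = (old_len - 1) / (expected_seq_len - 1)
    resized = []
    for i in range(expected_seq_len):
        pos = i * scale
        lo = min(int(pos), old_len - 1)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        resized.append(
            [(1.0 - frac) * a + frac * b for a, b in zip(pe[lo], pe[hi])]
        )
    return resized


# Example: stretch a 2-position table to 3 positions.
print(resize_positional_encoding([[0.0, 0.0], [2.0, 4.0]], 3))
```

Interpolating keeps the relative ordering of positions when stretching past the original length; whether a motion module behaves well with interpolated encodings is a separate question this sketch does not answer.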

Kosinkadink commented 8 months ago

Hey, thanks for the PR! SparseCtrl is something I've wanted to revisit for a while, so I should have time to review this sometime in the next two to three weeks.

PE adjustment is something I was thinking of exposing as one of the options for dealing with context lengths that exceed the built-in 32, but in addition, I also want to enable the use of View Options, and to add an option to make the input images 'context aware'. The context awareness would make it automatically put in one (or more) of the condhint inputs into context windows that have no condhint images if the manual indexes/spread methods don't have an explicit image to use.
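As a rough illustration of the 'context aware' behaviour described above, the sketch below assigns a hint image to every context window that has none of its own. Everything here is hypothetical: the function name fill_empty_windows, the representation of windows as lists of frame indexes, and the "nearest hint to the window centre" rule are assumptions for the example, not the repository's design.

```python
from typing import Dict, List, Sequence


def fill_empty_windows(
    windows: Sequence[Sequence[int]], hint_indexes: Sequence[int]
) -> Dict[int, List[int]]:
    """Map each window number to the hint frame indexes it should use.

    Hypothetical rule: a window keeps any hints that fall inside it;
    a window with no hints borrows the hint closest to its centre.
    """
    if not hint_indexes:
        raise ValueError("at least one hint index is required")

    hints = sorted(set(hint_indexes))
    assigned: Dict[int, List[int]] = {}
    for number, frames in enumerate(windows):
        if not frames:
            # An empty window has nothing to condition; skip it.
            continue
        explicit = [i for i in frames if i in hints]
        if explicit:
            assigned[number] = explicit
        else:
            centre = (frames[0] + frames[-1]) / 2
            nearest = min(hints, key=lambda h: abs(h - centre))
            assigned[number] = [nearest]
    return assigned


# Example: three windows of four frames, hints on frames 0 and 9.
windows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(fill_empty_windows(windows, [0, 9]))
```

In this example the middle window contains no hint frame, so it borrows frame 9, which is closer to its centre than frame 0.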

For the purposes of PE adjustment and View Options support, I plan on copying a bunch of changes I made in AnimateDiff-Evolved over to SparseCtrl (and to make my life easier by keeping the motion module code as similar as I can between AnimateDiff-Evolved and Advanced-ControlNet). Because of that, I'm not sure if I would accept this PR when I do review it, but I do want to acknowledge that something akin to the changes you propose to work with context_lengths greater than 32 will be added one way or another!