Open nazgut opened 8 months ago
Hey, thanks for the PR! SparseCtrl is something I've wanted to revisit for a while, so I should have time to review this sometime in the next two to three weeks.
PE adjustment is something I was thinking of exposing as one of the options for dealing with context lengths that exceed the built-in 32. In addition, I also want to enable the use of View Options and add an option to make the input images 'context aware'. Context awareness would automatically place one (or more) of the condhint inputs into context windows that have no condhint images, for cases where the manual index/spread methods don't provide an explicit image to use.
For the purposes of PE adjustment and View Options support, I plan on copying a bunch of changes I made in AnimateDiff-Evolved over to SparseCtrl (and to make my life easier by keeping the motion module code as similar as I can between AnimateDiff-Evolved and Advanced-ControlNet). So I'm not sure I would accept this PR when I do review it, but I do want to acknowledge that something akin to the changes you propose for working with context_lengths greater than 32 will be added one way or another!
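To make the 'context aware' idea above concrete, here is a rough sketch of one way it could behave. It is not code from Advanced-ControlNet: the function name `assign_hints`, the representation of windows as lists of frame indexes, and the "borrow the nearest hint" rule are all assumptions made for illustration.

```python
from typing import List


def assign_hints(windows: List[List[int]], hint_frames: List[int]) -> List[List[int]]:
    """For each context window, pick the hint frames it should use.

    Hypothetical helper: a window keeps the hint frames that fall inside it.
    A window with no hint of its own borrows the single hint frame closest
    to the window's center (an assumed fallback rule).
    """
    hints = sorted(set(hint_frames))
    assigned: List[List[int]] = []
    for window in windows:
        own = [frame for frame in window if frame in hints]
        # Keep explicit hints; also bail out if there is nothing to borrow.
        if own or not hints or not window:
            assigned.append(own)
            continue
        center = (window[0] + window[-1]) / 2
        nearest = min(hints, key=lambda h: abs(h - center))
        assigned.append([nearest])
    return assigned


if __name__ == "__main__":
    windows = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
    print(assign_hints(windows, hint_frames=[0, 10]))
```

In this example the middle window has no hint of its own, so it borrows frame 10, which is closer to its center than frame 0. A real implementation would also have to decide how a borrowed hint is weighted relative to an explicit one.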
Enhanced the model loading process for SparseCtrl models by introducing the `expected_seq_len` parameter to dynamically adjust positional encoding (PE) dimensions. This ensures that positional encodings stay compatible with models expecting different sequence lengths, which makes model loading more flexible, especially for models trained with a variety of configurations.

Changes include:

- Added `expected_seq_len` to SparseSettings, allowing for customizable sequence length settings.
- Adjusted the `pos_encoder.pe` parameters within the `adjust_positional_encoding_parameters` function, ensuring that the PE tensors match the expected sequence length.
- Updated the `SparseCtrlLoaderAdvanced` and `SparseCtrlMergedLoaderAdvanced` classes to utilize the `expected_seq_len` setting, integrating it into the model loading workflow.

This upgrade addresses potential type mismatches and improves the model's adaptability to different sequence lengths, so loading stays robust across diverse model configurations.
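For readers who want a feel for what "making the PE match the expected sequence length" can mean, below is a minimal sketch in plain Python. It is not the PR's `adjust_positional_encoding_parameters` function, whose resizing strategy is not described here. The helper name `adjust_pe_length`, the list-of-rows representation (real code would operate on a tensor), and the choice of truncation for shorter targets and linear interpolation for longer ones are all assumptions.

```python
from typing import List

Matrix = List[List[float]]  # one row per position, one column per channel


def adjust_pe_length(pe: Matrix, expected_seq_len: int) -> Matrix:
    """Return a copy of `pe` with exactly `expected_seq_len` rows.

    Assumed strategy: truncate when the target is shorter, otherwise
    linearly interpolate between neighbouring rows so that the first and
    last rows of the original table are preserved.
    """
    if expected_seq_len < 1:
        raise ValueError("expected_seq_len must be positive")
    cur_len = len(pe)
    if cur_len == 0:
        raise ValueError("pe must contain at least one position")
    if expected_seq_len <= cur_len:
        return [row[:] for row in pe[:expected_seq_len]]
    if cur_len == 1:
        # Nothing to interpolate between; repeat the only row.
        return [pe[0][:] for _ in range(expected_seq_len)]

    scale = (cur_len - 1) / (expected_seq_len - 1)
    out: Matrix = []
    for i in range(expected_seq_len):
        pos = i * scale                    # fractional position in the old table
        lo = min(int(pos), cur_len - 2)    # left neighbour, clamped at the end
        frac = pos - lo
        out.append([(1 - frac) * a + frac * b for a, b in zip(pe[lo], pe[lo + 1])])
    return out


if __name__ == "__main__":
    pe = [[0.0], [1.0], [2.0]]
    print(adjust_pe_length(pe, 5))  # stretched to 5 positions
    print(adjust_pe_length(pe, 2))  # truncated to 2 positions
```

Interpolation keeps the relative ordering of positions but changes the spacing the motion module was trained on, so whether it preserves output quality for context lengths above 32 would need to be checked empirically.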