About the training parameters of Spatial Lora: Recursive ones?

showlab / MotionDirector

[ECCV 2024 Oral] MotionDirector: Motion Customization of Text-to-Video Diffusion Models.

Apache License 2.0


I found an interesting implementation detail in your code: in lines 644-674 of MotionDirector_train.py, the spatial_lora is added once per video. As a result, the Linear layer inside a LoraInjectedLayer is itself recursively transformed into a LoraInjectedLayer. With two training videos, this leads to a computation like:

$$ \mathrm{Linear}_2\big[\mathrm{Linear}_1(x) + l_1^u l_1^d(x)\big] + l_2^u l_2^d\big[\mathrm{Linear}_1(x) + l_1^u l_1^d(x)\big] \dots $$

What is the motivation? Can I inject only once? If the number of videos is 100, 1000, ..., won't this cause problems?
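To make the scaling concern concrete, here is a minimal sketch of repeated wrapping. It uses plain-Python stand-ins, not torch modules and not the repository's classes: `Linear`, `LoraWrapped`, and `depth` are hypothetical names. It also assumes that each injection takes the existing layer as its base and adds one low-rank branch on the same input. Whether the injection in MotionDirector_train.py actually matches modules this way is not verified here.

```python
import random


class Linear:
    """Plain-Python stand-in for a bias-free linear layer (hypothetical)."""

    def __init__(self, in_f, out_f, rng):
        self.weight = [[rng.uniform(-1, 1) for _ in range(in_f)]
                       for _ in range(out_f)]

    def __call__(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class LoraWrapped:
    """Hypothetical wrapper computing base(x) + up(down(x))."""

    def __init__(self, base, in_f, out_f, rank, rng):
        self.base = base                    # may itself be a LoraWrapped
        self.down = Linear(in_f, rank, rng)
        self.up = Linear(rank, out_f, rng)

    def __call__(self, x):
        return [b + u for b, u in zip(self.base(x), self.up(self.down(x)))]


def depth(layer):
    """Count how many wrappers sit on top of the original layer."""
    d = 0
    while isinstance(layer, LoraWrapped):
        d += 1
        layer = layer.base
    return d


rng = random.Random(0)
base = Linear(4, 3, rng)

# Assumption: one injection per training video, each wrapping the previous result.
layer = base
adapters = []
for _ in range(2):
    layer = LoraWrapped(layer, 4, 3, rank=2, rng=rng)
    adapters.append(layer)

x = [1.0, -2.0, 0.5, 3.0]
nested = layer(x)

# Same result written as a flat sum: base(x) plus each low-rank branch on x.
flat = base(x)
for a in adapters:
    flat = [f + d for f, d in zip(flat, a.up(a.down(x)))]
```

Under these assumptions the nesting depth and the number of adapter weights grow linearly with the number of videos, and the nested output equals the base output plus one low-rank term per video, all applied to the same input `x`. If the real injection instead re-wraps the low-rank layers as well, the structure would differ from this sketch.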

Thanks.


About the training parameters of Spatial Lora: Recursive ones? #38