Hello, and thank you for the great work. I have a question about parameter initialization in Convpass: why are the weight and bias of adapter_down initialized to 0? Related papers such as LoRA and AdaptFormer mention that zero initialization is used to keep early training stable. But if these parameters are initialized to 0, every output of the Convpass branch should also be 0, so the Convpass branch we added would have no impact on the training of the model. Looking forward to your reply!
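To make the concern concrete, here is a minimal PyTorch sketch of what I mean. `ToyAdapter`, its dimensions, and the zero bias on `adapter_up` are my own assumptions for illustration; this is not the actual Convpass module, only a simplified bottleneck branch with `adapter_down` zero-initialized as described above.

```python
import torch
import torch.nn as nn


class ToyAdapter(nn.Module):
    """Hypothetical bottleneck adapter; NOT the actual Convpass code."""

    def __init__(self, dim: int = 16, bottleneck: int = 4):
        super().__init__()
        self.adapter_down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.adapter_up = nn.Linear(bottleneck, dim)
        # Zero-initialize adapter_down (weight and bias), as in the question.
        nn.init.zeros_(self.adapter_down.weight)
        nn.init.zeros_(self.adapter_down.bias)
        # Assumption: adapter_up keeps its default random weight but has a zero bias.
        nn.init.zeros_(self.adapter_up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter_up(self.act(self.adapter_down(x)))


torch.manual_seed(0)
adapter = ToyAdapter()
x = torch.randn(2, 5, 16)             # (batch, tokens, dim)
backbone_out = torch.randn(2, 5, 16)  # stand-in for the frozen layer's output

branch_out = adapter(x)               # output of the adapter branch at initialization
combined = backbone_out + branch_out  # residual sum of backbone and branch

print(torch.count_nonzero(branch_out).item())
print(torch.equal(combined, backbone_out))
```

In this toy setup the branch output is all zeros at initialization, so the combined output is identical to the backbone output on the first forward pass, which is the situation I am asking about.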