Open PixelChen24 opened 2 months ago
Hi Pixel, one straightforward solution of Eq. (18) is as follows. We first need $W_t^d x_t = W_{t+1}^d x_t$. Once that holds, we next need $W_t^u y_t = W_{t+1}^u y_t$. Thus $y_t = W_t^d x_t = W_{t+1}^d x_t$ is guaranteed by enforcing $W_t^d x_t = W_{t+1}^d x_t$, which is equivalent to $\Delta W^d x_t = 0$, one of the aims of our method.
Why do we need $W_t^d x_t = W_{t+1}^d x_t$ to satisfy Eq. (18)? Is it a necessary condition or a sufficient condition for Eq. (18)?
Sorry, I am still confused. Could you explain the chain in the reverse direction, step by step? That is, if $W_t^d x_t = W_{t+1}^d x_t$ can be achieved, how does it lead to Eq. (18)?
Well, you can see that in Eq. (18), $W_t^u W_t^d x_t = W_{t+1}^u W_{t+1}^d x_t$, the terms on the left and right sides correspond one to one: $W_t^u$ corresponds to $W_{t+1}^u$, $W_t^d$ corresponds to $W_{t+1}^d$, and $x_t$ corresponds to $x_t$. Satisfying the equation is therefore like a matching game. Obviously, $x_t$ equals $x_t$. Next, we want to achieve $W_t^d x_t = W_{t+1}^d x_t$, which gives the first equation in Eq. (22). Then, assuming this has been achieved, we can define $y_t = W_t^d x_t = W_{t+1}^d x_t$ and replace both $W_t^d x_t$ and $W_{t+1}^d x_t$ with $y_t$, so Eq. (18) becomes $W_t^u y_t = W_{t+1}^u y_t$. Handling it in the same way as $W_t^d x_t = W_{t+1}^d x_t$ gives the second equation in Eq. (22). In summary, if both equations in Eq. (22) hold, then Eq. (18) holds.
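The chain above can also be written out explicitly, which may help with the necessary-versus-sufficient question. This is only a sketch, assuming $\Delta W^d = W_{t+1}^d - W_t^d$ and $\Delta W^u = W_{t+1}^u - W_t^u$ (notation inferred from the thread, not quoted from the paper). Suppose $\Delta W^d x_t = 0$ and $\Delta W^u y_t = 0$ with $y_t = W_t^d x_t$. Then:

$$
\begin{aligned}
W_{t+1}^u W_{t+1}^d x_t &= W_{t+1}^u \left(W_t^d + \Delta W^d\right) x_t = W_{t+1}^u y_t \\
&= \left(W_t^u + \Delta W^u\right) y_t = W_t^u y_t = W_t^u W_t^d x_t .
\end{aligned}
$$

So the two conditions together are sufficient for Eq. (18). They do not appear to be necessary: for example, $W_{t+1}^d = 2 W_t^d$ and $W_{t+1}^u = \tfrac{1}{2} W_t^u$ also satisfy Eq. (18), although $W_{t+1}^d x_t \neq W_t^d x_t$ whenever $y_t \neq 0$.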
For how to achieve $W_t^d x_t = W_{t+1}^d x_t$, please refer to Sec. 3.2, Self-Attention Based Gradient Projection Method.
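As a rough illustration of what $\Delta W^d x_t = 0$ means in practice, here is a minimal Python sketch. It is not the procedure from Sec. 3.2: it assumes a single input vector `x_t` and simply removes, from each row of a raw update, the component along `x_t`. All names and numbers (`project_out`, `w_d_t`, `raw_update`) are hypothetical.

```python
# Minimal sketch, NOT the paper's Sec. 3.2 method.
# Assumption: one input vector x_t; we project a raw update so that delta_w @ x_t == 0.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def project_out(update, x):
    """Remove from each row of `update` its component along x."""
    xx = sum(a * a for a in x)
    projected = []
    for row in update:
        coeff = sum(a * b for a, b in zip(row, x)) / xx
        projected.append([a - coeff * b for a, b in zip(row, x)])
    return projected

x_t = [1.0, 2.0, 2.0]
w_d_t = [[0.5, -1.0, 0.0], [2.0, 0.0, 1.0]]        # hypothetical W_t^d
raw_update = [[0.3, 0.1, -0.2], [-0.4, 0.5, 0.6]]  # hypothetical unconstrained update

delta_w = project_out(raw_update, x_t)             # now delta_w @ x_t is (numerically) zero
w_d_next = [[a + b for a, b in zip(r1, r2)]        # W_{t+1}^d = W_t^d + delta_w
            for r1, r2 in zip(w_d_t, delta_w)]

y_old = matvec(w_d_t, x_t)      # W_t^d x_t
y_new = matvec(w_d_next, x_t)   # W_{t+1}^d x_t, equal to y_old up to rounding
```

With the update projected this way, `y_old` and `y_new` agree up to floating-point rounding, which is the first equation in Eq. (22) for this single input. With many inputs, the update would have to be orthogonal to the whole space they span.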
Hi, I've read your latest journal work
Gradient Projection For Continual Parameter-Efficient Tuning
and I am confused about your derivation. How do you get the equation $$ y_t = W_t^d x_t = W_{t+1}^d x_t $$? Why does $W_t^d x_t = W_{t+1}^d x_t$ hold?
Looking forward to your response. Thank you very much!