Closed ChenJunhao-Fighting closed 6 months ago
Does exchanging the C parameters require the sequence length L of the two modes to be equal? For example, can sequences of shape BL1D and BL2D not be exchanged?
Hi, the input shapes are expected to be the same. For inputs with different shapes, you can either apply a linear layer before passing them to the SSMs, or change the linear layer here.
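For illustration, the first option (a linear layer before the SSMs) could look roughly like the sketch below. This is not code from the repository: `LengthAligner` is a hypothetical helper, and it assumes the mismatch is only in the sequence length L, so the linear layer is applied along the length axis to map a BL2D tensor to BL1D.

```python
import torch
import torch.nn as nn


class LengthAligner(nn.Module):
    """Hypothetical helper: project a (B, L_in, D) sequence to (B, L_out, D)."""

    def __init__(self, l_in: int, l_out: int):
        super().__init__()
        # Linear layer acting on the sequence-length axis, not the feature axis.
        self.proj = nn.Linear(l_in, l_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, L_in, D) -> (B, D, L_in) -> (B, D, L_out) -> (B, L_out, D)
        return self.proj(x.transpose(1, 2)).transpose(1, 2)


# Example sizes, chosen arbitrarily for the sketch.
B, L1, L2, D = 2, 16, 24, 8
x1 = torch.randn(B, L1, D)  # first input, shape BL1D
x2 = torch.randn(B, L2, D)  # second input, shape BL2D

# Bring the second input to length L1 so both inputs have the same shape.
align = LengthAligner(L2, L1)
x2_aligned = align(x2)

print(x1.shape, x2_aligned.shape)
```

Note that a linear layer over the length axis fixes L1 and L2 when the module is constructed, so this sketch only fits inputs whose lengths are known in advance.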