PanYuQi66666666 closed 2 months ago
Hi, we perform the sum during the forward expansion to allow parallelization, and it does not affect the final result. In contrast, during the backward step, each group is treated separately.
Hi, I'm assuming this point has been clarified. Feel free to re-open the issue if that's not the case.
Regarding the Block Static Expansion mentioned in the paper: does it extend the input sequence to lengths of {32, 64, 128, 256, 512} and then perform the calculations on each of them? What I see in the code instead is {32, 64, 128, 256, 512} being added up to get 992. I am very confused. Could you please explain this part in detail? Thank you.
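For anyone reading later, here is one way to picture the 992 figure, as a minimal sketch rather than the repository's actual code. It assumes the five groups are laid end to end along a single axis so they can be processed in one pass, and are then sliced apart again when each group has to be handled separately. The names `GROUP_LENGTHS`, `group_slices`, and `expanded` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate

# Group lengths quoted in the question.
GROUP_LENGTHS = [32, 64, 128, 256, 512]


def group_slices(lengths):
    """Return (start, end) index pairs locating each group on one shared axis."""
    ends = list(accumulate(lengths))   # running totals: 32, 96, 224, 480, 992
    starts = [0] + ends[:-1]           # each group starts where the previous one ends
    return list(zip(starts, ends))


# Assumption: the groups share one axis, so its length is the sum of the group lengths.
total_length = sum(GROUP_LENGTHS)
slices = group_slices(GROUP_LENGTHS)

# Stand-in data for the shared axis (hypothetical, not real model activations).
expanded = list(range(total_length))

# Slicing recovers every group at its original length, so nothing is lost
# by placing the groups on one axis.
groups = [expanded[start:end] for start, end in slices]

print(total_length)
print(slices)
```

Under this reading, 992 is only the length of the shared axis. Each of the lengths 32, 64, 128, 256, and 512 still exists as its own slice, which is consistent with the reply that the groups are treated separately in the backward step.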