Hi, I have a question about a figure in the H3 paper.
Figure 2 shows a performance evaluation of FlashConv against cuFFT and attention.
Is it correct to think that it compares all operations in H3, including the QKV computation and kernel generation, and not just the FFTConv-related operations (FFTConv + elementwise multiplication + residual computation)?