Open sylee0124 opened 1 year ago
These are correct. We are cleaning up the code/algorithms for the fast block FFT and the three-pass algorithm and consolidating them into a single package; this repository is focused on the architecture pieces for now. Will update this issue when released!
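For readers following the thread: the operation both kernels accelerate is a long convolution computed via FFT. A minimal NumPy sketch of that core operation is below, as a rough illustration only; the names are not the repo's API, and the fused CUDA kernel additionally keeps intermediates on-chip to avoid memory traffic.

```python
import numpy as np

def fft_conv(u, k):
    # Causal (linear) convolution of input u with kernel k via FFT.
    # Zero-pad to 2N so the circular convolution matches the linear one,
    # then keep the first N outputs.
    n = u.shape[-1]
    fft_size = 2 * n
    u_f = np.fft.rfft(u, n=fft_size)
    k_f = np.fft.rfft(k, n=fft_size)
    return np.fft.irfft(u_f * k_f, n=fft_size)[..., :n]
```

The fused kernels compute this same mapping in one pass instead of separate FFT, pointwise-multiply, and inverse-FFT launches.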
Thanks for verifying :) When can I expect this performance update? Will it happen anytime soon?
Hopefully soon! I’ve been traveling for a bit, but have some time to code again soon.
Hi, I'm a bit confused about the current implementations in this repo versus the implementations used/discussed in the related papers. I'll just state what I think is true. Please correct me if I'm wrong.

- FlashConv from H3: the fused kernel is implemented in fftconv_cuda.cu, but it does not use block FFT.
- FlashButterfly from "Simple Hardware-Efficient Long Convolutions for Sequence Modeling": long_conv.py uses BlockFFT (which is the same as the butterfly decomposition), with support for learnable parameters for dft_matrix. But it does not use a fused kernel, and the three-pass algorithm is also not implemented.
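For reference, the block FFT / butterfly decomposition mentioned above is the classic Cooley-Tukey factorization: an N-point DFT with N = n1 * n2 becomes two passes of small dense DFT matrix multiplies with twiddle factors in between. A minimal NumPy sketch, with names chosen for illustration (they are not the identifiers in long_conv.py); making the small DFT factors learnable, as FlashButterfly does, amounts to replacing the fixed matrices below with trainable parameters.

```python
import numpy as np

def dft_matrix(n):
    # Dense n-point DFT matrix; these small factors are what
    # FlashButterfly can treat as learnable parameters.
    i = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(i, i) / n)

def block_fft(x, n1, n2):
    # Two-pass (butterfly) decomposition of an N-point DFT, N = n1 * n2.
    n = n1 * n2
    X = x.reshape(n1, n2)            # view input as an n1 x n2 grid
    X = dft_matrix(n1) @ X           # pass 1: length-n1 DFTs down columns
    tw = np.exp(-2j * np.pi *
                np.outer(np.arange(n1), np.arange(n2)) / n)
    X = X * tw                       # twiddle factors between passes
    X = X @ dft_matrix(n2).T         # pass 2: length-n2 DFTs along rows
    return X.T.reshape(n)            # transpose to restore natural order
```

Each pass is a batched small matrix multiply, which maps well onto tensor cores; that is the efficiency argument for the block decomposition over a monolithic FFT at long sequence lengths.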