FlagOpen / FlagGems

FlagGems is an operator library for large language models implemented in Triton Language.
Apache License 2.0

[Operator] Add radix-2 fft #205

Closed Bowen12992 closed 1 month ago

Bowen12992 commented 2 months ago

PR Category

Operator

Type of Change

New Feature

Description

This PR implements a native FFT using the Cooley-Tukey method; currently it supports only radix-2. For reasons not yet identified, the performance is far worse than the PyTorch ATen library (which actually uses the cuFFT library).
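For readers unfamiliar with the method, here is a minimal pure-Python sketch of an iterative radix-2 Cooley-Tukey FFT (decimation in time). It is not the Triton kernel from this PR; the function name `fft_radix2` and the in-place butterfly layout are assumptions chosen for illustration only.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT (illustrative sketch)."""
    n = len(x)
    # Radix-2 only handles lengths that are a power of two.
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    a = [complex(v) for v in x]

    # Step 1: reorder the input by bit-reversed index.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    # Step 2: log2(n) butterfly stages, doubling the sub-transform size.
    size = 2
    while size <= n:
        half = size // 2
        w_step = cmath.exp(-2j * cmath.pi / size)  # principal twiddle factor
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        size *= 2
    return a
```

Each of the log2(n) stages touches every element once, which gives the familiar O(n log n) cost. The strided, stage-dependent access pattern in the inner loops is the part that would have to be mapped onto GPU blocks in a kernel implementation.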

Issue

Progress

Performance

benchmark/test_special_perf.py Operator fft.fft Performance Test (torch.complex64)
Size    Torch Latency (ms)    Gems Latency (ms)    Gems Speedup
---------------------------------------------------------------
1024               0.00512              0.04096           0.125
11264             0.012288             0.082944           0.148
21504              0.01536             0.102144            0.15
31744             0.014336             0.100352           0.143
41984              0.01536              0.11776            0.13
52224             0.017408             0.115712            0.15
62464             0.016384              0.11776           0.139
72704             0.017408             0.131072           0.133
82944             0.014496             0.131072           0.111
93184             0.018432             0.131136           0.141
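
The values in the Gems Speedup column are consistent with Torch latency divided by Gems latency, so a value below 1 means the Gems operator is slower. A short sketch of that arithmetic, using two rows copied from the table above (the helper name `speedup` is hypothetical):

```python
def speedup(torch_ms, gems_ms):
    """Ratio of Torch latency to Gems latency; < 1 means Gems is slower."""
    return torch_ms / gems_ms


# (size, Torch latency in ms, Gems latency in ms), taken from the table above
rows = [
    (1024, 0.00512, 0.04096),
    (82944, 0.014496, 0.131072),
]

for size, torch_ms, gems_ms in rows:
    print(size, round(speedup(torch_ms, gems_ms), 3))
```

Across the listed sizes the ratio stays between roughly 0.11 and 0.15, meaning the Gems operator takes about 7 to 9 times as long as the Torch one.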

Bowen12992 commented 1 month ago

Since FFT is difficult to implement in Triton at the moment, I have checked out a new branch named fft_dev and am closing this PR.