Open zhangxiao-stack opened 8 months ago
Is there a Flash-Decoding algorithm implemented based on Triton?
IIUC, lightllm has implemented a Flash-Decoding Triton kernel. Maybe you can refer to it.
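For anyone landing here, a rough NumPy sketch of the Flash-Decoding idea may help as a reference when reading or writing a Triton kernel. It is not lightllm's code and not a Triton kernel: the function names `flash_decode_reference` and `naive_attention`, the `num_splits` parameter, and the single-query, single-head shapes are all assumptions made for illustration. It splits the KV cache along the sequence dimension, computes a partial softmax-attention output and log-sum-exp per split, then combines the partials with a log-sum-exp rescaling.

```python
import numpy as np

def flash_decode_reference(q, k, v, num_splits=4):
    """Split-KV decoding attention for one query vector (illustrative only).

    q: (d,), k: (n, d), v: (n, d_v). Names and shapes are hypothetical.
    """
    scale = 1.0 / np.sqrt(q.shape[-1])
    partial_out, partial_lse = [], []
    # Stage 1: each KV split is processed independently
    # (in a GPU kernel, these would run as parallel program instances).
    for k_chunk, v_chunk in zip(np.array_split(k, num_splits),
                                np.array_split(v, num_splits)):
        if k_chunk.shape[0] == 0:
            continue  # more splits than keys
        s = (k_chunk @ q) * scale      # attention scores for this split
        m = s.max()                    # local max for numerical stability
        p = np.exp(s - m)
        l = p.sum()
        partial_out.append((p @ v_chunk) / l)  # locally normalized output
        partial_lse.append(m + np.log(l))      # log-sum-exp of this split
    # Stage 2: reduce the partial results using their log-sum-exp values.
    lse = np.array(partial_lse)
    w = np.exp(lse - lse.max())
    w /= w.sum()
    return (w[:, None] * np.stack(partial_out)).sum(axis=0)

def naive_attention(q, k, v):
    """Plain softmax attention over the full KV cache, for comparison."""
    s = (k @ q) / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max())
    return (p / p.sum()) @ v
```

Both functions should agree up to floating-point error, which makes the naive version a handy check when testing a real kernel.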