microsoft / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
https://www.deepspeed.ai/
Apache License 2.0
35.31k stars 4.09k forks

[REQUEST] Sparse Attention FP32 support #1640

Open chutaklee opened 2 years ago

chutaklee commented 2 years ago
assert query.dtype == torch.half, "sparse attention only supports training in fp16 currently, please file a github issue if you need fp32 support"

Any update on fp32 support for sparse attention? Currently I'm happy training a sparse model in fp16, but fp32 seems more versatile to me.

I can work on this, but I'm not familiar with Triton or GPU programming. Could you at least point me to some hints?
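In the meantime, one possible workaround for the dtype assert is to keep the model in fp32 and cast only the attention inputs to half, casting the output back afterwards. Below is a minimal sketch of that idea; `fp16_attention` is a hypothetical stand-in for DeepSpeed's fp16-only sparse attention kernel (plain dense attention here, computed in fp32 internally for CPU portability), not the real Triton implementation:

```python
import torch

def fp16_attention(q, k, v):
    # Hypothetical stand-in for the fp16-only sparse attention kernel.
    # It reproduces the dtype assert quoted in this issue.
    assert q.dtype == torch.half, "sparse attention only supports training in fp16 currently"
    # Compute in fp32 internally for CPU portability; the real kernel
    # runs in fp16 on the GPU.
    qf, kf, vf = q.float(), k.float(), v.float()
    scores = qf @ kf.transpose(-1, -2) / qf.shape[-1] ** 0.5
    return (torch.softmax(scores, dim=-1) @ vf).half()

def attention_fp32_wrapper(q, k, v):
    # Workaround sketch: cast fp32 inputs to half, run the fp16-only
    # kernel, then cast the result back to fp32. This loses precision
    # relative to a true fp32 kernel but avoids the dtype assert.
    return fp16_attention(q.half(), k.half(), v.half()).float()

q = torch.randn(2, 4, 8)  # (batch, seq_len, head_dim), fp32
k = torch.randn(2, 4, 8)
v = torch.randn(2, 4, 8)
out = attention_fp32_wrapper(q, k, v)
print(out.dtype)  # torch.float32
```

This is only a precision-lossy workaround; true fp32 support would still require changes to the Triton kernels themselves.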

alexanderswerdlow commented 2 years ago

Bumping this! I would really appreciate fp32 support!