Closed ryusaeba closed 3 weeks ago
As the flash-attention example shows, I suspect this library cannot support flash attention, since mx_mapping.inject_pyt_ops mainly affects PyTorch ops and modules. Please correct me if I am wrong.
You're correct. The library won't quantize the flash-attention ops.
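To make the mechanism concrete, here is a generic Python sketch (hypothetical names, not the library's actual internals) of why symbol injection misses fused kernels: injection rebinds the op that eager code looks up at call time, but a fused/custom kernel like flash attention carries its own compiled routine and never performs that lookup.

```python
import types

# Stand-in for a PyTorch-style ops namespace.
ops = types.SimpleNamespace()

def matmul(a, b):
    # Stands in for the original torch.matmul.
    return a * b

ops.matmul = matmul

def eager_attention(q, k):
    # Resolves ops.matmul at call time, so it sees the injected version.
    return ops.matmul(q, k)

# A fused kernel binds the original routine when it is built,
# so later namespace injection never reaches it.
_kernel_matmul = matmul

def flash_attention(q, k):
    return _kernel_matmul(q, k)

def quantized_matmul(a, b):
    # Pretend quantization: truncate the inputs before multiplying.
    return int(a) * int(b)

# inject_pyt_ops-style replacement: rebind the namespace symbol.
ops.matmul = quantized_matmul

print(eager_attention(2.5, 2.5))   # quantized path: prints 4
print(flash_attention(2.5, 2.5))   # bypasses injection: prints 6.25
```

The eager path picks up the quantized op, while the "fused" path runs unmodified, which mirrors why the injected MX ops never see flash-attention's internal matmuls.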