neuralmagic/nm-vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://nm-vllm.readthedocs.io

[Usage]: Do you have any plans to support a sparse FP8 kernel, and to support it on ROCm? #382

Closed: DehuaTang closed this issue 3 months ago

DehuaTang commented 3 months ago

Your current environment

The output of `python collect_env.py`

How would you like to use vllm

Great job! Do you have any plans to support a sparse FP8 kernel, and to support it on ROCm?
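For reference, a minimal sketch of what the requested combination might look like, assuming nm-vllm's existing `sparsity` and `quantization` keyword arguments on `vllm.LLM`. Combining them into a single sparse FP8 path on ROCm is hypothetical, since the thread confirms no such kernel exists; the model name below is a placeholder:

```python
from vllm import LLM, SamplingParams

# Hypothetical invocation: nm-vllm exposes `sparsity=` for its sparse
# kernels and `quantization=` for FP8 weight quantization (CUDA today).
# A combined sparse-FP8 kernel on ROCm, as asked above, does not exist;
# this only illustrates what such an API could look like.
llm = LLM(
    model="your-org/your-pruned-model",  # placeholder pruned checkpoint
    sparsity="sparse_w16a16",            # existing nm-vllm sparse path
    quantization="fp8",                  # existing FP8 quantization path
)

params = SamplingParams(temperature=0.0, max_tokens=32)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```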

mgoin commented 3 months ago

It isn't in the plans at the moment, but I could see us implementing it in the future if there is a use case.