Open yongchaoding opened 2 weeks ago
As we all know, lmdeploy runs fastest with AWQ W4A16. However, FP8 is now used in many places, so I wonder whether the developers have any plan to develop a fast W4A8-FP8 kernel in lmdeploy?
+1
I will start the work on W8A8 after my current work is done. W4A8 should come after W8A8.