Checklist
- [x] 2. Please use English, otherwise it will be closed.
Motivation
AWQ with INT4 weights and FP8 activations / KV cache works fairly well with Llama-3 models and is a useful quantization technique for the high-throughput regime. Is this quantization format supported by SGLang?
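For concreteness, here is a minimal sketch of the launch we would like to work. The flag names below are assumptions based on SGLang's existing server arguments, and the model path is just an example public AWQ checkpoint; whether these options compose into an INT4-weight + FP8-activation/KV-cache path is exactly the question.

```python
import subprocess

# Sketch only: --quantization and --kv-cache-dtype are existing SGLang server
# flags, but it is unclear whether they combine into the INT4 weights + FP8
# activations / FP8 KV cache mode asked about here. The model path is an
# example AWQ checkpoint, not a recommendation.
subprocess.run([
    "python", "-m", "sglang.launch_server",
    "--model-path", "hugging-quants/Meta-Llama-3-8B-Instruct-AWQ-INT4",
    "--quantization", "awq",          # INT4 AWQ weights
    "--kv-cache-dtype", "fp8_e5m2",   # FP8 KV cache
])
```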
Related resources
https://github.com/NVIDIA/TensorRT-LLM/blob/b7868dd1bd1186840e3755b97ea3d3a73ddd76c5/examples/falcon/README.md?plain=1#L311