Open tjtanaa opened 1 year ago
Hi @tjtanaa, I wonder how the roadmap is going. I'm quite excited to use the AWQ quantized format; when will it be supported?
@HAN-oQo Hi, the vLLM authors said they are working on a more efficient AWQ implementation in Triton, so we will address AWQ on ROCm after they have released their new kernel.
Thank you for the answer! @tjtanaa I also wonder why the safetensors format is not supported. Do you have a plan to support it?
Thank you for offering the nice project.
@HAN-oQo Loading safetensors is buggy on the ROCm platform; the memory management during loading might be the cause. The issue often shows up when the tensor-parallel size is larger than 1, whereas loading from pt is totally fine.
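
To illustrate the workaround described above, here is a minimal sketch of a file-selection rule that falls back to pt-style checkpoints on ROCm when tensor parallelism is enabled. This is not vLLM code: `pick_weight_files` and its parameters are hypothetical names, and treating `.bin` as a PyTorch-pickle checkpoint alongside `.pt` is an assumption.

```python
from pathlib import Path

# Hypothetical helper (not part of vLLM): decide which checkpoint files to load.
def pick_weight_files(model_dir, tensor_parallel_size, on_rocm):
    model_dir = Path(model_dir)
    # Assumption: ".bin" files are PyTorch-pickle checkpoints, like ".pt".
    pt_files = sorted(p for p in model_dir.iterdir() if p.suffix in {".pt", ".bin"})
    st_files = sorted(model_dir.glob("*.safetensors"))

    # Per the comment above, safetensors loading is reported as unreliable
    # on ROCm when tensor parallelism is larger than 1, so use pt files there.
    if on_rocm and tensor_parallel_size > 1:
        if not pt_files:
            raise FileNotFoundError(f"no .pt/.bin checkpoint found in {model_dir}")
        return pt_files

    # Otherwise prefer safetensors and fall back to pt files if none exist.
    return st_files or pt_files
```

For example, `pick_weight_files("path/to/model", tensor_parallel_size=2, on_rocm=True)` would return only the pt-style files, or raise if the directory holds none.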