Hello,
When quantizing a Llama model, we first convert the weights downloaded from Meta using the Hugging Face converter, and then apply Hugging Face-compatible AWQ quantization.
Is there an AMD-specific quantization tool that removes the dependency on Hugging Face?
Are you trying to convert a PyTorch model to an ONNX model? If so, then yes, today we use the Hugging Face converter. I will check on possible alternatives for the future and post an update.
Thanks, Ashima