Open xiguadong opened 2 weeks ago
Hello, the config at https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct-GPTQ-Int4/blob/c34a4a91629f09f73a285f32dbd26106b033c654/config.json#L29 says the group size is 128 for the 4-bit and 8-bit variants. Could you tell me the group size used for this model?
Also, if I want to deploy the official 4-bit model to QNN, how should I do it?
Thanks.
The Qwen model on AI Hub Models is Qwen 2.0, and its block group size is 64.
If using our provided model, you can deploy it using the tutorial: https://github.com/quic/ai-hub-apps/tree/main/tutorials/llm_on_genie
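If you want to double-check the group size recorded in a downloaded checkpoint yourself, a minimal sketch like the one below can read it from `config.json`. This assumes the common Hugging Face GPTQ layout where the setting lives under the `quantization_config` key (the exact path to your local config file will differ):

```python
import json

def gptq_group_size(config_path: str):
    """Return the GPTQ group size recorded in a model's config.json,
    or None if the file has no quantization_config section."""
    with open(config_path) as f:
        config = json.load(f)
    # GPTQ settings conventionally live under "quantization_config"
    return config.get("quantization_config", {}).get("group_size")
```

For example, running it against the Qwen2.5-0.5B-Instruct-GPTQ-Int4 config linked above should return 128, matching the value on line 29 of that file.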