intel / auto-round

Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
https://arxiv.org/abs/2309.05516
Apache License 2.0

Ref GPTQModel for both quant and inference #196

Closed Qubitium closed 1 month ago

Qubitium commented 3 months ago

GPTQModel has fully integrated AutoRound since v0.9.6. This PR adds a reference to GPTQModel for both the quantization step using AutoRound and inference.

wenhuach21 commented 3 months ago

Hi @Qubitium,

Thank you for your great work on GPTQModel.

1. Since you are using a different API, to avoid confusing users we could add a community section and link to your README instead. What do you think?

2. After the release of auto-round v0.3, we intend to make the auto-round format the default so that a unified API can support CPU, HPU, and CUDA. However, due to certain constraints, we were unable to pack the CUDA kernels (referred to as v2 in your terminology) into our package. I was therefore wondering if you could consider splitting the kernel part into a separate Git repository, similar to what AutoAWQ did.

wenhuach21 commented 1 month ago

https://github.com/intel/auto-round/pull/266