Closed by Qubitium 1 month ago
Hi @Qubitium,
Thank you for your great work on GPTQModel.
1. Since GPTQModel uses a different API, we could add a community section and link to your README instead, to avoid confusing users. What do you think?
2. After the release of auto-round v0.3, we intend to make the auto-round format the default so that CPU, HPU, and CUDA share a unified API. However, due to certain constraints, we have been unable to package the CUDA kernels (referred to as v2 in your terms) in our package. Would you consider splitting the kernel part into a separate Git repository, similar to what autoawq did?
GPTQModel has fully integrated AutoRound since v0.9.6. This PR adds a reference to GPTQModel for both the quantization step using AutoRound and inference.