intel / xFasterTransformer


[Tools] Add convert tool for Llama models quantized by AutoGPTQ #276

Closed xiangzez closed 7 months ago

Duyi-Wang commented 7 months ago

It's preferable to add an additional parameter for quantization rather than creating a new converter; that way we can integrate new quantization methods, such as AWQ, in the future.

Duyi-Wang commented 7 months ago

And update the description of the Converter in the README.

Duyi-Wang commented 7 months ago

How about quantization: Optional[str] = "gptq"? A use_gptq flag would be inconvenient when adding new types in the future.
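
A minimal sketch of that suggestion, assuming a single convert entry point (the function name, parameters, and branches below are hypothetical, not the actual xFasterTransformer converter API): a string-valued quantization argument can later cover AWQ or other methods, whereas a use_gptq boolean cannot.

```python
from typing import Optional


def convert(input_dir: str, output_dir: str,
            quantization: Optional[str] = None) -> None:
    """Convert a Hugging Face checkpoint to the xFT weight format (sketch).

    quantization: None for a plain FP16/BF16 checkpoint, "gptq" for an
    AutoGPTQ-quantized one; a future method such as "awq" only needs a
    new branch here instead of a whole new converter.
    """
    if quantization is None:
        # Plain checkpoint: copy/cast weights as the existing converter does.
        print(f"Converting FP checkpoint: {input_dir} -> {output_dir}")
    elif quantization == "gptq":
        # GPTQ checkpoint: dequantize/repack the AutoGPTQ tensors first.
        print(f"Converting GPTQ checkpoint: {input_dir} -> {output_dir}")
    else:
        raise ValueError(f"Unsupported quantization method: {quantization!r}")


# Example call with the proposed argument:
convert("llama-2-7b-autogptq", "xft-llama-2-7b", quantization="gptq")
```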

xiangzez commented 7 months ago

For documentation, I think we need a separate tutorial page for quantization. @miaojinc has already written a doc; we should update it and put it in the main branch.

miaojinc commented 7 months ago

Sure, I will update the doc after this PR is merged.

Duyi-Wang commented 7 months ago

@miaojinc Could you add it to our docs? The source code is under the docs branch.

miaojinc commented 7 months ago

Yes, sure. I will do that in a new pull request for the quantization document.

changqi1 commented 7 months ago

@xiangzez CI failed on the Baichuan model?

xiangzez commented 7 months ago

@changqi1 CI issue should be fixed in #287

Duyi-Wang commented 7 months ago

Rebase or merge the main branch?

changqi1 commented 7 months ago

We could rebase to check this PR's status.