Closed miaojinc closed 2 months ago
My concern is that the packages in requirements.txt may trigger some issues during security checking; let's target the next version.
Yes, it has some potential issues. The pinned package versions are needed because we lack group quantization operators, so we have to quantize the models on CPU according to our kernel. If we can align with autoawq and autogptq, we can load the quantized weights directly, without any modification to autoawq or autogptq.
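For context, a minimal sketch of what "group quantization on CPU" means here: weights are split into fixed-size groups, each with its own scale. This is an illustrative example only, not the actual xFasterTransformer kernel; the function name, group size, and bit width are assumptions.

```python
import numpy as np

def quantize_groupwise(w, group_size=128, bits=4):
    """Quantize a flat weight array group-by-group to signed ints (sketch)."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 7 for 4-bit
    groups = w.reshape(-1, group_size)      # one scale per group
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0               # avoid division by zero
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax).astype(np.int8)
    return q, scales

# Round-trip check: dequantized weights stay close to the originals.
rng = np.random.default_rng(0)
w = rng.standard_normal(512).astype(np.float32)
q, s = quantize_groupwise(w, group_size=128)
w_hat = (q * s).reshape(w.shape)
```

Aligning the group layout and scale storage with what autoawq/autogptq emit is what would let their checkpoints load directly.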
Closed, has been merged in https://github.com/intel/xFasterTransformer/commit/59a9430d4ee2ca99de4ca4ea78b9f3eba868e900.
Update content related to the Convert Python API; add quantization options.