-
While running this example:
```
$ cd TensorRT-Model-Optimizer/llm_ptq
$ scripts/huggingface_example.sh --type llama --model $model --quant fp8 --tp 2
```
there was a non-fatal failure:
```
[8ad0971d…
-
When I try to apply PTQ to mobilenetv2, I get an error: "ImportError: cannot import name 'ConvBNReLUFusion' from 'torch.quantization.fx.fusion_patterns'"
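As background on what the missing class did (this is context, not a fix for the import path): `ConvBNReLUFusion` was part of the fusion machinery that folds a BatchNorm layer into the preceding convolution before quantization. The arithmetic behind that fold can be sketched in plain Python; the scalar per-channel form below is illustrative only and is not the torch API:

```python
import math

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BatchNorm parameters into the preceding conv's weight/bias.

    Scalar per-channel sketch: the folded conv computes
    gamma * (w*x + b - mean) / sqrt(var + eps) + beta.
    """
    s = gamma / math.sqrt(var + eps)
    return w * s, (b - mean) * s + beta

# One output channel: conv weight 2.0, bias 0.5, then BN with the stats below.
w_f, b_f = fold_bn(2.0, 0.5, gamma=1.0, beta=0.0, mean=0.5, var=1.0 - 1e-5)
```

After folding, the conv+BN pair becomes a single conv, which is why fused Conv-BN-ReLU blocks quantize as one unit.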
-
Hi all,
I have been trying to apply **post-training quantization** to a custom vision model (a pretrained vgg16) which I have already fine-tuned using "xpu" (Intel GPU Max Series). I have saved …
-
Hello, congratulations on your work, it's really impressive. I'm a newcomer to quantization, and I recently need to implement low-bit post-training quantization (PTQ). Since my knowledge is limited, there are a few questions I would very much appreciate your answers to:
1. Is PTQ only supported by iao?
2. In the README's "load the pruned model and then quantize", does "quantize" refer to QWT or PTQ?
3. I would like to do low-precision PTQ (2–5 bit); can your work be adapted to achieve this, or does PTQ only support 8-bit quantization?
Thanks again…
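As background on the low-bit question above: uniform symmetric PTQ generalizes from 8-bit to any bit width by shrinking the integer range the weights are mapped onto. A minimal per-tensor sketch in plain Python (the helper names are hypothetical, not from the project):

```python
def quantize_symmetric(weights, bits):
    """Quantize a list of floats to signed integers with `bits` bits,
    using one scale for the whole tensor (per-tensor symmetric PTQ)."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8-bit, 3 for 3-bit
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / qmax                # chosen from the data, no training
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to floats for inference-time use."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 0.9]
q, s = quantize_symmetric(weights, 3)     # 3-bit: integers clamped to [-4, 3]
```

The only bit-width-dependent quantity is `qmax`, which is why the same scheme covers 2–8 bit; the practical difficulty at 2–5 bit is the larger rounding error, not the mechanics.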
-
Is it `python --weights xxx --include onnx --dynamic`? The ONNX I export this way cannot be used in PTQ. How should I correctly export the ONNX required by the preparation step?
-
Here we keep track of which parts of `quantize` in `ptq_common.py` are tested and which are still missing.
-
When I ran [ptq.py](https://github.com/open-mmlab/mmrazor/blob/main/tools/ptq.py), it unfortunately threw an error; the message is as follows. The reason for the error is most likely …
-
Issue to track areas with missing tests that should be added.
- [ ] Handling of scaling_min_val combined with different restrictions on scale factors.
- [x] Calibration of activations scale factor…
-
onnx_ptq/evaluate_vit.py error: ValueError: Runtime TRT is not supported.
![企业微信截图_17225064037714](https://github.com/user-attachments/assets/b1ad1ffc-9744-46ac-8d2e-ed6aeb5584a2)
-
Is it possible to add to https://nvidia.github.io/TensorRT-LLM/ the code copy widget that you already have on https://nvidia.github.io/TensorRT-Model-Optimizer/?
For example if you go to https://nvidi…