-
I am trying out vlm_ptq by following the README in the vlm_ptq folder. When I run the command `scripts/huggingface_example.sh --type llava --model llava-1.5-7b-hf --quant fp8 --tp 8`, (--deployment com…
-
I am trying out vlm_ptq by following the README in the vlm_ptq folder. When I run the command `scripts/huggingface_example.sh --type llava --model llava-1.5-7b-hf --quant fp8 --tp 8`, the following error m…
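Under the hood the script drives ModelOpt's Python quantization API; as a rough, illustrative sketch of what FP8 PTQ looks like at that level (the tiny stand-in model and random calibration data below are assumptions, not what the script actually runs):

```python
# Hedged sketch of FP8 PTQ with NVIDIA ModelOpt's Python API.
# The small Linear model and random calibration batches are stand-ins for the real
# VLM and its calibration dataset, which huggingface_example.sh handles for you.
import torch
import modelopt.torch.quantization as mtq

model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
calib_data = [torch.randn(8, 64) for _ in range(4)]

def forward_loop(m):
    # Calibration: run representative batches so activation ranges are recorded.
    for batch in calib_data:
        m(batch)

# FP8_DEFAULT_CFG quantizes weights and activations to FP8 with per-tensor scales.
model = mtq.quantize(model, mtq.FP8_DEFAULT_CFG, forward_loop)
```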
-
**Describe the bug**
Attempting to save PTQ-quantized `TorchVision` models from the `ptq_benchmark_torchvision.py` script, after amending the script to export the model with `export_torch_qcdq` as a final ste…
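For reference, a minimal sketch of the kind of amendment described, using Brevitas' `export_torch_qcdq` on a tiny stand-in model (this is not the actual `ptq_benchmark_torchvision.py` code, and the export path is illustrative):

```python
# Hedged sketch: exporting a Brevitas-quantized model with export_torch_qcdq.
# The single QuantConv2d layer stands in for the benchmarked TorchVision model.
import torch
from brevitas.nn import QuantConv2d
from brevitas.export import export_torch_qcdq

model = torch.nn.Sequential(QuantConv2d(3, 8, kernel_size=3, weight_bit_width=8))
model.eval()

dummy_input = torch.randn(1, 3, 32, 32)
# export_torch_qcdq traces the model into a TorchScript module built from
# quantize/cast/dequantize ops.
export_torch_qcdq(model, args=dummy_input, export_path="model_qcdq.pt")
```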
-
Hi, I love your great tutorials.
I have studied many SOTA PTQ papers for ViT, such as I-ViT, but I found that they are all based on simulated quantization (FakeQ).
I want to deploy that kind of external PTQ implemen…
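For context, "simulation (FakeQ)" here means values are rounded to the integer grid but kept in floating point, so no integer kernels actually run; a minimal illustration in plain PyTorch (not taken from any of the cited papers):

```python
# Illustration of fake (simulated) quantization: values are snapped to an int8 grid
# but stay in float32, so inference still runs on floating-point kernels.
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (x.abs().max() / qmax).clamp(min=1e-8)  # symmetric per-tensor scale
    q = torch.clamp(torch.round(x / scale), qmin, qmax)
    return q * scale                                 # dequantize back to float

x = torch.randn(4, 4)
print(fake_quantize(x) - x)  # small rounding error; dtype is still torch.float32
```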
-
### Right Case
When I follow the doc: https://github.com/pytorch/executorch/blob/main/examples/models/llama/README.md#enablement,
I export the Llama3.2-1B-Instruct:int4-spinquant-eo8 model to xnnpa…
-
### 🚀 The feature, motivation and pitch
Currently the QNN quantizer only supports PTQ (post-training quantization), and we'd like to enable QAT (quantization-aware training) for better quantization supp…
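For reference, the PT2E workflow already exposes a QAT entry point; a rough sketch of what QAT with the QNN quantizer might look like is below (the `QnnQuantizer` import path and its behavior under QAT are assumptions, and the graph-capture API varies by PyTorch version):

```python
# Hedged sketch of a PT2E QAT flow. prepare_qat_pt2e/convert_pt2e are the standard
# torch.ao APIs; whether QnnQuantizer's annotations work under QAT is the open question.
import torch
from torch.ao.quantization.quantize_pt2e import prepare_qat_pt2e, convert_pt2e
from executorch.backends.qualcomm.quantizer.quantizer import QnnQuantizer  # assumed path

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU())
example_inputs = (torch.randn(1, 16),)

# Graph capture (export_for_training on recent PyTorch; older releases used
# capture_pre_autograd_graph).
exported = torch.export.export_for_training(model, example_inputs).module()

quantizer = QnnQuantizer()
prepared = prepare_qat_pt2e(exported, quantizer)  # inserts trainable fake-quant observers

# ... fine-tune `prepared` with the usual training loop ...

quantized = convert_pt2e(prepared)
```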
-
### 💡 Your Question
I have followed exactly the same steps for model training followed by PTQ and QAT as described in the official super-gradients notebook:
https://github.com/Deci-AI/super-gradients/blob…
-
**Describe the bug**
I tried to optimize a BERT model with bert_ptq_cpu.json, but it produced 7 output models.
Is there any way, or a change to the config, to get only one output model?
```
[2024-10-25 10:54:59,1…
-
Converting this dummy model with `quantize_target_type="int8"` and `per_tensor=True` throws an error in TFLite
```python
import torch.nn as nn
import torch
from tinynn.graph.quantization.quantizer …
-
### 🚀 The feature, motivation and pitch
I am trying to implement an eager mode for PT2E quantization on CPU. Currently, PT2E quantization on CPU is lowered to Inductor via `torch.compile`. The current…
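For context, a rough sketch of the existing PT2E-on-CPU flow that is lowered through Inductor today (the quantizer choice and graph-capture API are assumptions and vary by PyTorch version):

```python
# Hedged sketch of the current PT2E PTQ flow on CPU, lowered to Inductor via torch.compile.
import torch
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, convert_pt2e
from torch.ao.quantization.quantizer.x86_inductor_quantizer import (
    X86InductorQuantizer,
    get_default_x86_inductor_quantization_config,
)

model = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.ReLU()).eval()
example_inputs = (torch.randn(1, 32),)

# Graph capture (export_for_training on recent PyTorch; older releases used
# capture_pre_autograd_graph).
exported = torch.export.export_for_training(model, example_inputs).module()

quantizer = X86InductorQuantizer()
quantizer.set_global(get_default_x86_inductor_quantization_config())

prepared = prepare_pt2e(exported, quantizer)
prepared(*example_inputs)            # calibration pass
converted = convert_pt2e(prepared)

# Today the quantized graph is run through Inductor; the feature request is to be able
# to run `converted` eagerly instead of compiling it.
compiled = torch.compile(converted)
compiled(*example_inputs)
```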