-
# Error message:
```
(rkllm) python rkllm-toolkit/examples/test.py
INFO: rkllm-toolkit version: 1.1.2
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.
Loa…
```
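(An aside, not part of the original log: this warning comes from Hugging Face `transformers`; `trust_remote_code` only takes effect when the model is loaded through the `Auto*` factory classes, for example:)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code is honored by the Auto* classes, which may need to run
# custom modeling code shipped with the checkpoint; passing it to a concrete
# model class is silently ignored, producing the warning above
tokenizer = AutoTokenizer.from_pretrained("path/to/checkpoint", trust_remote_code=True)  # placeholder path
model = AutoModelForCausalLM.from_pretrained("path/to/checkpoint", trust_remote_code=True)
```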
-
**Describe the bug**
Attempting to save post-training-quantized (PTQ) `TorchVision` models using the `ptq_benchmark_torchvision.py` script, after amending the script to save the model with `export_torch_qcdq` as a final ste…
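For context, a minimal sketch of the export step described above, assuming a Brevitas version that exposes `export_torch_qcdq` from `brevitas.export` (the exact signature may vary across releases; `quant_model` stands in for the calibrated PTQ model):

```python
import torch
from brevitas.export import export_torch_qcdq

# quant_model: placeholder for the calibrated PTQ TorchVision model
dummy_input = torch.randn(1, 3, 224, 224)  # ImageNet-shaped tracing input
export_torch_qcdq(quant_model, args=dummy_input, export_path="model_qcdq.pt")
```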
-
Hi,
When we assign numbers in the Experiment column of the manifest (1, 2, 3, …, 10, 11, 12, …, 20, 21, 22, …), the order in which the samples appear in a quant table is 1, 10, 11, …, 2, 20, 21, …, etc. Is…
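If the underlying question is why this happens: plain string sorting is lexicographic, so "10" sorts before "2". A generic illustration (not DIA-NN code) of the effect, and of the usual zero-padding workaround:

```python
labels = [f"Exp_{i}" for i in (1, 2, 10, 11, 20)]
print(sorted(labels))
# ['Exp_1', 'Exp_10', 'Exp_11', 'Exp_2', 'Exp_20']  <- lexicographic order

# zero-padding the experiment numbers restores the intended numeric order
padded = [f"Exp_{i:03d}" for i in (1, 2, 10, 11, 20)]
print(sorted(padded))
# ['Exp_001', 'Exp_002', 'Exp_010', 'Exp_011', 'Exp_020']
```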
-
Hey, I am learning how to use Vitis AI 3.0 and trying to run the Vitis AI 3.0 quickstart tutorial for `VCK190` resnet18.
In the "PyTorch tutorial" section:
`
Step 7 : Next, let’s run…
-
I can provide the execution provider like this:
`config.StaticQuantConfig(calibration_data_reader=data_reader, quant_format=QuantFormat.QOperator, execution_provider="DmlExecutionProvider")`, but there…
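For reference, a hedged sketch of how such a config is typically consumed, assuming the `quantize(model_input, model_output, quant_config)` entry point from `onnxruntime.quantization` (paths, the input name, and the data reader are placeholders, and execution-provider support may vary by onnxruntime version):

```python
import numpy as np
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, StaticQuantConfig, quantize

class RandomDataReader(CalibrationDataReader):
    """Placeholder reader: a handful of random samples for calibration."""
    def __init__(self, input_name="input", shape=(1, 3, 224, 224), n=8):
        self.samples = iter(np.random.rand(n, *shape).astype(np.float32))
        self.input_name = input_name

    def get_next(self):
        batch = next(self.samples, None)
        return None if batch is None else {self.input_name: batch}

cfg = StaticQuantConfig(
    calibration_data_reader=RandomDataReader(),
    quant_format=QuantFormat.QOperator,
    execution_provider="DmlExecutionProvider",  # as in the snippet above
)
quantize("model.onnx", "model_int8.onnx", cfg)  # placeholder paths
```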
-
### 🐛 Describe the bug

```
python torchchat.py generate stories110M --quant torchchat/quant_config/cuda.json --prompt "It was a dark and stormy night, and"
Using device=cuda Tesla T4
Loading model...…
```
-
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
```python
from aw…
```
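As an aside (a generic sketch, not from the snippet above): this error usually means one operand lives on the CPU while the other lives on the GPU; moving both onto the same device before the batched matmul resolves it.

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
a = torch.randn(2, 3, 4)                 # created on the CPU
b = torch.randn(2, 4, 5, device=device)  # created on the GPU (if available)
out = torch.bmm(a.to(device), b)         # both operands on one device -> no RuntimeError
```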
-
### Problem
Given a quantized model (for example llama2-7B-nf4), vanilla inference dequantizes the weights to fp16 or bf16 before computing. Does exllamav2 support no-dequant inference?
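To make the premise concrete, a toy sketch (not exllamav2 internals): the "vanilla" path materializes a full 16-bit copy of the 4-bit weight before the matmul, whereas a no-dequant kernel would expand values on the fly inside the kernel without ever storing the full-precision weight.

```python
import torch

codebook = torch.linspace(-1.0, 1.0, 16).to(torch.bfloat16)  # stand-in for the 16 NF4 levels
idx = torch.randint(0, 16, (1024, 1024))                     # one 4-bit index per weight
scale = torch.full((1024, 1), 0.01, dtype=torch.bfloat16)    # toy per-row scale

w = codebook[idx] * scale                       # dequantized weight, materialized in bf16
x = torch.randn(1, 1024, dtype=torch.bfloat16)
y = x @ w.T                                     # the actual compute runs in bf16
```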
-
Device: NVIDIA NX
1. Using trtexec with `--fp16`:
`/usr/src/tensorrt/bin/trtexec --onnx=best.onnx --workspace=4096 --saveEngine=best.engine --fp16`
The measured inference time is 36.8 ms.
2. Using pytorch_qua…
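For reference (not from the original post): when the pytorch_quantization path exports a QDQ ONNX model, the TensorRT engine is typically built with the `--int8` flag, e.g. `/usr/src/tensorrt/bin/trtexec --onnx=best_qdq.onnx --workspace=4096 --saveEngine=best_int8.engine --int8` (file names here are placeholders).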
-
Hello! I am running DiaNN 1.8 on Windows and noticed something odd; I was wondering whether there is an explanation for it. I basically have two runs.
All in One: running all the samples at once …