-
Hi,
Although the current generate.py uses the fast WaveNet generation algorithm, it is still too slow.
Is it possible to quantize the network?
However, the TF tutorial says we need to specify …
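As background for the question, here is a minimal sketch of what weight quantization does, using symmetric per-tensor int8 quantization in plain NumPy. This is illustrative only; it is not the TF tutorial's API, and the function names are mine.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map the largest magnitude to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error is bounded by about scale / 2 per element.
print(np.max(np.abs(w - w_hat)))
```

Storing `q` instead of `w` cuts weight memory 4x versus float32, which is why quantization is a common fix for slow or memory-bound generation.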
-
### Problem Description
Seeing a GPU fault when running the onnxruntime-inference-examples script with reduced-layer BERT models during benchmarking.
It appears quantization/calibration steps work …
-
Hi, I've been following your tutorial to compile Llava 1.5 7B VLM and I was able to compile everything successfully. However, when I run the app, I get the following error:
![image](https://github.…
-
xft version:1.8.2
lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 52 bits physical, 48 bits virtual
Byte Order: Little End…
-
### Description of the bug:
Hello,
I'm encountering an issue when trying to export a model to tflite with quantization. It appears that the tensor shapes are being altered incorrectly somewher…
-
I am running torchao: 0.5 and torch: '2.5.0a0+b465a5843b.nv24.09' on an NVIDIA A6000 ADA card (sm89) which supports FP8.
I ran the generate.py code from the benchmark:
python generate.py --c…
-
@casper-hansen Hi, I have a question about the AWQ-quantized model on HuggingFace: https://huggingface.co/TheBloke/Llama-2-7B-AWQ/tree/main?show_file_info=model.safetensors
The shapes o…
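For context on why AWQ tensor shapes look "shrunk": in 4-bit GEMM-style AWQ checkpoints (as produced by AutoAWQ), several 4-bit values are packed into each int32 element, so `qweight` is stored with its output dimension divided by the pack factor. The sketch below is my reading of that packing scheme, not something read from this particular file:

```python
def packed_qweight_shape(in_features, out_features, w_bit=4):
    # 32 // w_bit quantized values fit into one int32 column,
    # so the stored tensor is narrower than the logical weight.
    pack_factor = 32 // w_bit
    return (in_features, out_features // pack_factor)

# e.g. a logical 4096 x 4096 linear layer stores qweight as (4096, 512) int32
print(packed_qweight_shape(4096, 4096))
```

So a shape that looks 8x too small along one axis is expected for 4-bit packing, not a corrupted checkpoint.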
-
Hello, thank you for your work; it has helped me a lot.
However, GPU memory usage still puts me under pressure.
I wonder if the model can be quantized further, e.g. to 2-bit? Can you provide a referenc…
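A back-of-envelope estimate of why lower bit-widths relieve memory pressure: weight memory scales linearly with bits per weight. The helper below is illustrative (names and the 7B parameter count are my assumptions, not from the project):

```python
def weight_memory_gb(n_params, bits_per_weight):
    # bytes = params * bits / 8; using 1 GB = 1e9 bytes
    return n_params * bits_per_weight / 8 / 1e9

# For a 7B-parameter model: 16-bit -> 14.0 GB, 8-bit -> 7.0 GB,
# 4-bit -> 3.5 GB, 2-bit -> 1.75 GB (weights only, excluding
# activations and KV cache).
for bits in (16, 8, 4, 2):
    print(bits, weight_memory_gb(7e9, bits))
```

Note that 2-bit halves memory again versus 4-bit, but accuracy typically degrades more sharply at that width.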
-
**Why is it that when using a quantized model for inference, the TTFT improvement is not obvious, while the overall inference efficiency improves a lot? At the same time, the inference efficiency…
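One common explanation, stated here as general reasoning rather than a confirmed diagnosis of this setup: TTFT is dominated by the compute-bound prefill pass, while per-token decode is memory-bandwidth-bound, since every weight must be read once per generated token. Weight-only quantization shrinks the bytes read, so it speeds decode far more than prefill. A rough sketch of the decode-side arithmetic (all numbers illustrative):

```python
def decode_ms_per_token(n_params, bytes_per_weight, bandwidth_bytes_per_s):
    # Memory-bound lower bound: each weight byte is streamed once per token.
    return n_params * bytes_per_weight / bandwidth_bytes_per_s * 1000

# Hypothetical 7B model on a GPU with ~900 GB/s memory bandwidth:
fp16 = decode_ms_per_token(7e9, 2.0, 900e9)   # ~15.6 ms/token
int4 = decode_ms_per_token(7e9, 0.5, 900e9)   # ~3.9 ms/token
print(fp16, int4)
```

By this estimate 4-bit weights give roughly a 4x decode speedup, while prefill (and hence TTFT) barely changes because it is limited by FLOPs, not bandwidth.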
-
Error occurred when executing Joy_caption_load:
No package metadata was found for bitsandbytes
File "E:\ComfyUI-aki-v1.3\execution.py", line 317, in execute
output_data, output_ui, has_subgraph…
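"No package metadata was found for bitsandbytes" usually means bitsandbytes is not installed (or its metadata is broken) in the specific Python environment ComfyUI runs in. A hedged way to confirm this from that environment; the check mirrors the condition the node trips over:

```python
import importlib.metadata

try:
    # This is the same metadata lookup that raised the error above.
    print(importlib.metadata.version("bitsandbytes"))
except importlib.metadata.PackageNotFoundError:
    # Install into the same interpreter ComfyUI uses, e.g.:
    #   python -m pip install bitsandbytes
    print("bitsandbytes metadata not found in this environment")
```

On portable/embedded ComfyUI installs, make sure `pip` targets the bundled interpreter, not a system-wide Python.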