-
I am trying to quantize the whole model, but whenever I try to load the model using the quantized scope, it gives me an error like this:
```
import sys, os
import numpy as np
import tensorflow as tf
fro…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
![image](https://github.com/THUDM/ChatGLM2-6B/assets/66857797/ea9537ae-5002-4640-bbce-e…
-
### Your current environment
...
### How would you like to use vllm
I have downloaded a model. Now, on my 4-GPU instance, I am attempting to quantize it using AutoAWQ.
Whenever I run the script below I ge…
-
I get the following error when I run this model. Can you tell me which PyTorch version I should use to run it?
The error is:
/home/nvidia/.local/lib/python3.8/site-packages/torch…
-
Hi again,
I've successfully quantized an ONNX model to INT8, then converted it to a TensorRT engine, and noticed a performance increase compared to FP16.
```bash
python -m modelopt.onnx.quantizati…
-
# Model Request
### Which model would you like to see in the model zoo?
A quantized MobileNet (any version) would be fine. TensorFlow has published end-to-end quantized [MobileNet…
-
I have downloaded a model. Now, on my 4-GPU instance, I am attempting to quantize it using AutoAWQ.
Whenever I run the script below, I get 0% GPU utilization.
Can anyone help explain why this might be happening?
…
-
Hi,
Is there any randomness involved in the quantization process other than the random value in the stochastic rounding scheme?
I ask because I noticed that if I turn on quantization and feed a batch…
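
To make the question concrete, here is a minimal sketch of what I mean by stochastic rounding. It is a plain Python illustration, not this library's actual code: `stochastic_round`, `quantize`, the sample values, the step size, and the seed are all hypothetical. In this sketch the random draw is the only source of nondeterminism, so two runs with the same seed produce identical outputs.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x down or up at random; P(round up) equals the fractional part."""
    floor = math.floor(x)
    frac = x - floor
    return floor + (1 if rng.random() < frac else 0)


def quantize(values, scale, rng):
    """Map floats onto an integer grid of step `scale` (hypothetical helper)."""
    return [stochastic_round(v / scale, rng) for v in values]


values = [0.12, 0.37, -0.58, 0.91]  # illustrative inputs
scale = 0.25                        # illustrative step size

# Two runs seeded identically draw the same random numbers,
# so they yield the same quantized integers.
run_a = quantize(values, scale, random.Random(0))
run_b = quantize(values, scale, random.Random(0))
print(run_a == run_b)
```

If the outputs still differ between runs once the rounding is seeded like this, the difference would have to come from some other source, which is what I am trying to identify.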
-
### Search before asking
- [X] I have searched the Ultralytics YOLO [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### Ultralytics YOLO Component
Pred…
-
### OpenVINO Version
2024.0.0
### Operating System
Ubuntu 20.04 (LTS)
### Device used for inference
None
### OpenVINO installation
PyPi
### Programming Language
Python
### Hardware Architect…