-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS…
-
Hi, I read the docs about `zero_quant`, but it seems to require extra training.
And in `deepspeed.init_inference`, the `dtype` can be set to int8, but the code does nothing for int8. https://github…
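For context on what an int8 inference path would have to do, here is a minimal, self-contained sketch of symmetric per-tensor int8 post-training quantization. This is an illustration of the general technique, not DeepSpeed's actual code; the function names are made up for this example.

```python
# Hypothetical sketch (not DeepSpeed's code path): what an int8
# post-training kernel conceptually does to a weight tensor.
# Symmetric per-tensor quantization: scale = max(|w|) / 127.

def quantize_int8(weights):
    """Quantize a list of floats to int8 values plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Note that this needs no extra training, which is what distinguishes plain PTQ from ZeroQuant-style approaches that add a distillation step.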
-
System information
Ubuntu 20.04 (Linux)
pip tensorflow==2.12.0
using transformers `WhisperForConditionalGeneration`
I'm trying to convert Whisper from TF to TFLite and quantize it to int8, using the whisper …
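TFLite's full-integer path calibrates activation ranges from a representative dataset. Independent of TensorFlow itself, the per-tensor scale/zero-point computation can be sketched as follows; the function names here are illustrative, not the TFLite API.

```python
# Sketch of asymmetric per-tensor int8 calibration, in the style of
# TFLite full-integer quantization. Given min/max observed over a
# representative dataset, map [rmin, rmax] onto the int8 range [-128, 127].

def calibrate_int8(samples):
    rmin = min(min(batch) for batch in samples)
    rmax = max(max(batch) for batch in samples)
    rmin = min(rmin, 0.0)  # range must include 0 so 0.0 is exactly representable
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / 255.0 or 1.0
    zero_point = round(-128 - rmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map a float to its int8 code, clamped to [-128, 127]."""
    return max(-128, min(127, round(x / scale) + zero_point))

# A tiny "representative dataset" of activation batches:
samples = [[0.0, 2.0, 4.0], [1.0, 5.1]]
scale, zp = calibrate_int8(samples)
```

In the real converter this is what supplying `representative_dataset` to `tf.lite.TFLiteConverter` drives under the hood, once per conversion.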
-
### Issue type
Feature Request
### Have you reproduced the bug with TensorFlow Nightly?
No
### Source
binary
### TensorFlow version
v2.13.0-17-gf841394b1b7
### Custom code
No
### OS platform…
-
I tried running the IntelAI DLRM model with int8 precision using the default int8_configure.json. Could someone clarify whether quantization happens each time the inference_performance.sh script is triggered, or if…
-
Hello,
I'm training an object detection model based on a custom dataset and I'm not sure about the way to use transfer learning (i.e. fine tune an already existing model with our data).
I'm fol…
-
Prerequisite
- [x] Introduce data type for int8 quantization
NYI: CPU backend kernel for the int8 quantization type
- [x] Add #5509
- [x] ArgMax
- [x] ArgMin
- [x] AvgPool2D #5593
- [ ] BatchMatmul
- …
-
I tried running `ipex.optimize` followed by tracing/scripting. I am not able to see any fusion groups in the IR (`torch.jit.last_executed_optimized_graph()`). Is there any way to get the fusion groups other t…
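A minimal sketch of the graph-inspection step with plain TorchScript (ipex is omitted here, since it may not be installed): the profiling executor only produces its optimized graph after a couple of warm-up runs, so the traced module must be called before dumping the graph.

```python
# Trace a tiny elementwise module, run it twice to trigger the
# profiling executor, then scan the last optimized graph for
# fusion-group nodes (e.g. prim::TensorExprGroup).
import torch

class Tiny(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x + 1.0) * 2.0  # elementwise chain, a fusion candidate

model = Tiny().eval()
x = torch.randn(4)
traced = torch.jit.trace(model, x)
traced(x); traced(x)  # warm-up runs so optimization kicks in

graph = torch.jit.last_executed_optimized_graph()
fusion_nodes = [n.kind() for n in graph.nodes() if "Group" in n.kind()]
print(fusion_nodes)  # non-empty only if the executor actually fused
```

Whether any fusion group appears depends on the executor settings and backend, so an empty list does not by itself mean `ipex.optimize` had no effect.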
-
Supporting int8 quantized models is essential for mobile scenarios and many NPU architectures. TensorFlow (Lite) and ONNX, for instance, have built-in int8 quantization support, and WebNN should t…
-
Hello, I built Bolt (tag: v1.5.1) with the linux-x86_64_avx512 version and converted an ONNX model to a PTQ version with X2bolt. Then I tried post_training_quantization to quantize it to int8 precision. I follow th…