-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS…
-
Hi, I read the docs about `zero_quant`, but it seems to require extra training.
And in `deepspeed.init_inference`, the `dtype` can be set to int8, but the code does nothing for int8. https://github…
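For context on what an int8 inference path would have to do, here is a minimal, self-contained sketch of symmetric per-tensor int8 post-training quantization. This is an illustration of the general technique, not DeepSpeed's actual code; the function names are made up for this example.

```python
# Hypothetical sketch (not DeepSpeed's code path): what an int8
# post-training kernel conceptually does to a weight tensor.
# Symmetric per-tensor quantization: scale = max(|w|) / 127.

def quantize_int8(weights):
    """Quantize a list of floats to int8 values plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Note that this needs no extra training, which is what distinguishes plain PTQ from ZeroQuant-style approaches that add a distillation step.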
-
System information
Ubuntu 20.04 (Linux)
pip tensorflow==2.12.0
using transformers `WhisperForConditionalGeneration`
I'm trying to convert Whisper from TF to TFLite and quantize it to int8, using the whisper …
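TFLite's full-integer path calibrates activation ranges from a representative dataset. Independent of TensorFlow itself, the per-tensor scale/zero-point computation can be sketched as follows; the function names here are illustrative, not the TFLite API.

```python
# Sketch of asymmetric per-tensor int8 calibration, in the style of
# TFLite full-integer quantization. Given min/max observed over a
# representative dataset, map [rmin, rmax] onto the int8 range [-128, 127].

def calibrate_int8(samples):
    rmin = min(min(batch) for batch in samples)
    rmax = max(max(batch) for batch in samples)
    rmin = min(rmin, 0.0)  # range must include 0 so 0.0 is exactly representable
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / 255.0 or 1.0
    zero_point = round(-128 - rmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map a float to its int8 code, clamped to [-128, 127]."""
    return max(-128, min(127, round(x / scale) + zero_point))

# A tiny "representative dataset" of activation batches:
samples = [[0.0, 2.0, 4.0], [1.0, 5.1]]
scale, zp = calibrate_int8(samples)
```

In the real converter this is what supplying `representative_dataset` to `tf.lite.TFLiteConverter` drives under the hood, once per conversion.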
-
### Issue type
Feature Request
### Have you reproduced the bug with TensorFlow Nightly?
No
### Source
binary
### TensorFlow version
v2.13.0-17-gf841394b1b7
### Custom code
No
### OS platform…
-
I tried running the IntelAI DLRM model with int8 precision using the default int8_configure.json. Could someone clarify whether quantization happens each time the inference_performance.sh script is triggered, or if…
-
Hello,
I'm training an object detection model based on a custom dataset and I'm not sure about the way to use transfer learning (i.e. fine tune an already existing model with our data).
I'm fol…
-
Prerequisite
- [x] Introduce data type for int8 quantization
NYI: CPU backend kernel for the int8 quantization type
- [x] Add #5509
- [x] ArgMax
- [x] ArgMin
- [x] AvgPool2D #5593
- [ ] BatchMatmul
- …
-
I tried running `ipex.optimize` followed by tracing/scripting. I am not able to see any fusion groups in the IR (`torch.jit.last_executed_optimized_graph()`). Is there any way to get the fusion groups other t…
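A minimal sketch of the graph-inspection step with plain TorchScript (ipex is omitted here, since it may not be installed): the profiling executor only produces its optimized graph after a couple of warm-up runs, so the traced module must be called before dumping the graph.

```python
# Trace a tiny elementwise module, run it twice to trigger the
# profiling executor, then scan the last optimized graph for
# fusion-group nodes (e.g. prim::TensorExprGroup).
import torch

class Tiny(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x + 1.0) * 2.0  # elementwise chain, a fusion candidate

model = Tiny().eval()
x = torch.randn(4)
traced = torch.jit.trace(model, x)
traced(x); traced(x)  # warm-up runs so optimization kicks in

graph = torch.jit.last_executed_optimized_graph()
fusion_nodes = [n.kind() for n in graph.nodes() if "Group" in n.kind()]
print(fusion_nodes)  # non-empty only if the executor actually fused
```

Whether any fusion group appears depends on the executor settings and backend, so an empty list does not by itself mean `ipex.optimize` had no effect.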
-
Supporting int8 quantized models is essential for mobile scenarios and many NPU architectures. TensorFlow (Lite) and ONNX, for instance, have built-in int8 quantization support, and WebNN should t…
-
Hello, I built Bolt (tag: v1.5.1) with the linux-x86_64_avx512 version and converted an ONNX model to a PTQ version with X2bolt. Then I tried post_training_quantization to quantize it to int8 precision. I follow th…