-
Hi, I failed to quantize an ONNX model (weights stored as fp16) to int8 because of an overflow.
The following code is from `modelopt.onnx.quantization.ort_patching`:
```python
threshold = max(abs(min_value), a…
```
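To illustrate the kind of overflow I am hitting, here is a hypothetical sketch (the `weights` values are made up): range arithmetic done in fp16 overflows to inf near the fp16 limit of 65504, while upcasting to float32 first keeps the result finite.
```python
import numpy as np

# fp16 only represents finite values up to 65504, so range/threshold
# arithmetic performed in fp16 overflows to inf for large weights.
weights = np.array([-60000.0, 60000.0], dtype=np.float16)  # hypothetical tensor
print(weights.max() - weights.min())  # inf: 120000 exceeds the fp16 limit

# Workaround sketch: upcast to float32 before computing the threshold.
w32 = weights.astype(np.float32)
threshold = max(abs(float(w32.min())), abs(float(w32.max())))
print(threshold)  # 60000.0, finite in fp32
```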
-
The inference speed of the int8-quantized version of SDXL is much slower than that of fp16. I am running the TensorRT 9.3 SDXL demo, and here is the result (I changed the shape to 768x1344 manually):
fp16 : pyt…
-
The training process is quite slow, whereas using 8-bit HQQ speeds it up by more than tenfold. Is this normal, or have I missed something in the code?
```python
import torch
from transformers import EetqConfi…
```
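Presumably the snippet continues with the standard `EetqConfig` loading path in transformers; a minimal sketch of that path (the model id is a placeholder):
```python
import torch
from transformers import AutoModelForCausalLM, EetqConfig

# Quantize the linear layers to int8 with EETQ at load time.
quantization_config = EetqConfig("int8")
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=quantization_config,
    device_map="auto",
)
```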
-
Is there any plan to add int8 quantization support on GPU for GPT-2 or other transformer models? Thanks.
-
### Describe the issue
Now I'm replicating this [implementation](https://intel.github.io/intel-extension-for-pytorch/llm/cpu/#compile-from-source):
pytorch=2.1.0.dev20230711+cpu
intel-extension-for…
-
### System information
- OS: openSUSE Tumbleweed (Linux)
- TensorFlow installation: pip
- TensorFlow version: tf-nightly (occurs on earlier versions too)
### Code
Converting a model containing an …
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS…
-
I tried running `ipex.optimize` followed by tracing/scripting. I am not able to see any fusion groups in the IR (`torch.jit.last_executed_optimized_graph()`). Is there any way to get the fusion groups other t…
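For reference, a minimal sketch of the workflow I mean (a toy conv+relu model stands in for the real one); the warm-up runs matter because the profiling executor only finalizes its optimized graph after a few executions:
```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval()
x = torch.randn(1, 3, 32, 32)

model = ipex.optimize(model)
with torch.no_grad():
    traced = torch.jit.trace(model, x)
    traced = torch.jit.freeze(traced)
    # Warm-up: the profiling executor finalizes fusions only after a few runs.
    for _ in range(3):
        traced(x)

# Expecting fused groups here, but none show up.
print(torch.jit.last_executed_optimized_graph())
```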
-
### 1. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win 10 22H2 (but reproducible elsewhere)
- TensorFlow installation (pip package or built from source): pip pack…
-
Hi, I read the docs about `zero_quant`, but it seems to require extra training.
And in `deepspeed.init_inference`, the `dtype` can be set to int8, but the code does nothing for int8. https://github…
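For concreteness, a minimal sketch of the call in question (the model and settings here are placeholders):
```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# dtype=torch.int8 is accepted, but it is unclear what (if anything)
# the int8 path actually does here.
engine = deepspeed.init_inference(
    model,
    mp_size=1,
    dtype=torch.int8,
    replace_with_kernel_inject=True,
)
```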