-
### 🚀 Feature request
Quantization is a widely used technique to accelerate models, particularly when using the [torch.compile](https://pytorch.org/tutorials/intermediate/torch_compile_tutorial.htm…
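As a minimal sketch of combining the two (the toy model and shapes below are made up for illustration, not taken from the request), eager-mode dynamic quantization can be applied before handing the module to `torch.compile`; ops the compiler cannot trace simply fall back to eager execution:

```python
import torch
import torch.nn as nn

# Toy stand-in model; any nn.Module with Linear layers works the same way.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# Post-training dynamic quantization: weights stored as int8, activations
# quantized on the fly at each Linear call.
qmodel = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

# Compile the quantized module; TorchDynamo graph-breaks around quantized
# ops it cannot handle and runs them eagerly.
compiled = torch.compile(qmodel)
print(compiled(torch.randn(1, 512)).shape)
```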
-
### System Info
GPU - A10
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task in the `…
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and found no similar bug report.
### YOLOv8 Component
Export
### Bug
When us…
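For context, a typical INT8 export call through the documented Ultralytics API looks like the sketch below; whether this matches the truncated reproduction above is an assumption:

```python
from ultralytics import YOLO

# Load a pretrained checkpoint and export a TensorRT engine with INT8
# calibration enabled.
model = YOLO("yolov8n.pt")
model.export(format="engine", int8=True)
```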
-
### Describe the issue
1. Tried running https://github.com/intel/intel-extension-for-pytorch/blob/release/2.3/examples/cpu/inference/python/llm/run.py to generate the q_config_summary file
2. Then…
-
By using [pytorch-quantization](https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html) I was able to create TensorRT engine models that are (almost) fully int8 and…
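A minimal sketch of that workflow following the toolkit docs; the ResNet-50 example network and file names are placeholders, and the calibration pass is elided:

```python
import torch
import torchvision
from pytorch_quantization import quant_modules, quant_nn

# Monkey-patch torch.nn so layers created afterwards carry INT8 fake-quant
# (Q/DQ) wrappers.
quant_modules.initialize()

model = torchvision.models.resnet50(weights="DEFAULT").eval()
# ... run calibration data through the model to set the quantizer ranges ...

# Emit ONNX QuantizeLinear/DequantizeLinear nodes on export, then build the
# engine with e.g. `trtexec --onnx=resnet50_int8.onnx --int8`.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
torch.onnx.export(model, torch.randn(1, 3, 224, 224), "resnet50_int8.onnx",
                  opset_version=13)
```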
-
### 🐛 Describe the bug
After QAT training, inference fails with the following error:
NotImplementedError: Could not run 'quantized::linear' with arguments from the 'CPU' backend. This could be …
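This error usually means a regular fp32 tensor reached a converted quantized module, most often because the input was never routed through a `QuantStub` or the model was not converted before inference. A minimal eager-mode sketch of the expected flow (toy model for illustration):

```python
import torch
import torch.nn as nn

class QATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # fp32 -> int8 at the input
        self.linear = nn.Linear(16, 4)
        self.dequant = torch.ao.quantization.DeQuantStub()  # int8 -> fp32 at the output

    def forward(self, x):
        return self.dequant(self.linear(self.quant(x)))

model = QATModel()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)
# ... QAT fine-tuning loop ...
model.eval()
qmodel = torch.ao.quantization.convert(model)  # swaps in quantized::linear
print(qmodel(torch.randn(1, 16)))              # now runs on the quantized CPU backend
```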
-
Hello all,
I was curious how actively Bonito is still being used, as I read that Dorado has now converted a majority of its neural network code to INT8.
I was interested in experimenting with t…
-
Hi,
I have just installed the TensorRT Model Optimizer using `pip install "nvidia-modelopt[all]" --no-cache-dir --extra-index-url https://pypi.nvidia.com`. I was then using it to quantize an ONNX m…
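For reference, a minimal PTQ sketch assuming the `modelopt.onnx.quantization.quantize` entry point from the Model Optimizer docs; the argument names and the random calibration batch below are illustrative and should be checked against the installed version:

```python
import numpy as np
from modelopt.onnx.quantization import quantize

# Placeholder calibration batch shaped like the model's input.
calibration_data = np.random.rand(32, 3, 224, 224).astype(np.float32)

quantize(
    onnx_path="model.onnx",
    quantize_mode="int8",
    calibration_data=calibration_data,
    output_path="model.quant.onnx",
)
```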
-
Can AutoAWQ support int2, int3, or int8 quantization?
I see it only supports int4 quantization now.
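For reference, the currently documented path is 4-bit only; a minimal sketch of it (the model id is just an example):

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "facebook/opt-125m"  # example model id
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# w_bit=4 is the supported width at the time of the question.
model.quantize(tokenizer, quant_config=quant_config)
model.save_quantized("opt-125m-awq")
```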
-
From the issue https://developer.apple.com/forums/thread/740518 ("how do we use the computational power of the A17 Pro Neural Engine?")
I learned that if I want to run inference with my mlmodel on my iPad Pro with …
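On the Python side, coremltools lets you request the Neural Engine when loading a model; a minimal sketch follows (the file name is a placeholder), with the Swift equivalent being `MLModelConfiguration.computeUnits = .all`:

```python
import coremltools as ct

# ComputeUnit.ALL lets Core ML schedule ops on the Neural Engine where
# supported, falling back to GPU/CPU otherwise.
model = ct.models.MLModel("MyModel.mlpackage", compute_units=ct.ComputeUnit.ALL)
# prediction = model.predict({"input": ...})  # keys depend on your model's inputs
```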