-
Loading a saved model runs into the following error.
It also takes a very long time to run and save quantized models.
```
2024-03-21 08:48:58 [INFO] loading weights file models/4_bit_llama2-rtn/model.sa…
```
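For context, a minimal save/reload round trip with the INC 2.x PyTorch API might look like the sketch below; the tiny stand-in model, the paths, and the location of the `load` helper are assumptions, not the reporter's actual setup.
```
import torch
from neural_compressor import PostTrainingQuantConfig, quantization
from neural_compressor.utils.pytorch import load  # assumed location of the load helper

# Stand-in fp32 model; the real case is a 4-bit RTN-quantized Llama 2.
fp32_model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())

# RTN weight-only quantization; no calibration data needed (assumption).
conf = PostTrainingQuantConfig(approach="weight_only")
q_model = quantization.fit(fp32_model, conf)
q_model.save("./4_bit_rtn_model")

# Reloading uses the original fp32 model as a scaffold.
reloaded = load("./4_bit_rtn_model", fp32_model)
```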
-
**Describe the bug and context**
I'm trying to quantize an optimized Stable Diffusion model.
I learned that `IncDynamicQuantization` causes a smaller reduction in inference speed than `OnnxDynamicQuanti…
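If it helps to compare, the two approaches presumably correspond to something like the calls below; the model filenames are placeholders, the ONNX Runtime call is its documented dynamic-quantization entry point, and the INC call shape is an assumption about the 2.x API.
```
# ONNX Runtime dynamic quantization: int8 weights, activations quantized at runtime.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic("unet_fp32.onnx", "unet_ort_int8.onnx",
                 weight_type=QuantType.QInt8)

# Intel Neural Compressor dynamic quantization (assumed 2.x API).
from neural_compressor import PostTrainingQuantConfig, quantization

conf = PostTrainingQuantConfig(approach="dynamic")
q_model = quantization.fit("unet_fp32.onnx", conf)
q_model.save("unet_inc_int8.onnx")
```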
-
I evaluated all of the classification models against ImageNet according to their preprocessing descriptions (a preprocessing sketch follows the list):
Models:
-----------
- squeezenet1.0-12.onnx
- bvlcalexnet-12.onnx
- caffenet-12.onnx
…
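For the torchvision-style entries in that list, the documented ImageNet preprocessing is roughly the sketch below; note that the caffe-derived models (bvlcalexnet, caffenet) document different mean handling, so these mean/std values are the common torchvision convention, not universal across the list.
```
import numpy as np
from PIL import Image

def preprocess(path):
    """Resize to 256, center-crop 224, normalize, return 1x3x224x224 float32."""
    img = Image.open(path).convert("RGB").resize((256, 256))
    off = (256 - 224) // 2
    img = img.crop((off, off, off + 224, off + 224))
    x = np.asarray(img, dtype=np.float32) / 255.0
    # Torchvision-convention mean/std; caffe-style models use raw-scale means.
    x = (x - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
    return x.transpose(2, 0, 1)[None].astype(np.float32)  # HWC -> NCHW + batch
```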
-
As mentioned in the paper "TEQ: Trainable Equivalent Transformation for Quantization of LLMs",
the authors claim: "The training process is lightweight, requiring only 1K steps …
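In INC 2.x, TEQ is exposed as one of the weight-only algorithms selectable through op_type_dict; a minimal sketch, assuming that API (the toy model, random calibration data, and exact key names are illustrative, not a verified recipe):
```
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

fp32_model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())
# TEQ is trainable, so calibration data is needed (stand-in random data).
calib = DataLoader(TensorDataset(torch.randn(32, 64)), batch_size=8)

conf = PostTrainingQuantConfig(
    approach="weight_only",
    op_type_dict={
        ".*": {  # ".*" matches every op
            "weight": {
                "bits": 4,
                "group_size": 128,
                "scheme": "asym",
                "algorithm": "TEQ",
            },
        },
    },
)
q_model = quantization.fit(fp32_model, conf, calib_dataloader=calib)
```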
-
Hi team, I am having an issue quantizing a network consisting of Conv and Linear layers using **int8** weights and activations in ONNX. I have tried setting this via op_type_dict; however, it doesn't wo…
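One common pitfall here: in an ONNX graph, Linear layers surface as MatMul (or Gemm) nodes, so op_type_dict must key on the ONNX op names. A sketch under that assumption; the model path, dummy-data shape, and dataset/dataloader helpers are placeholders patterned on the INC README examples:
```
from neural_compressor import PostTrainingQuantConfig, quantization
from neural_compressor.data import DataLoader, Datasets

# Dummy calibration data matched to the model's input shape (placeholder).
dataset = Datasets("onnxrt_qlinearops")["dummy"](shape=(10, 3, 224, 224))
calib = DataLoader(framework="onnxruntime", dataset=dataset)

# Key on ONNX op names: Linear layers appear as MatMul (or Gemm) nodes.
conf = PostTrainingQuantConfig(
    approach="static",
    op_type_dict={
        "Conv": {
            "weight": {"dtype": ["int8"]},
            "activation": {"dtype": ["int8"]},
        },
        "MatMul": {
            "weight": {"dtype": ["int8"]},
            "activation": {"dtype": ["int8"]},
        },
    },
)
q_model = quantization.fit("model.onnx", conf, calib_dataloader=calib)
```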
-
### Add Link
https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html
### Describe the bug
Following the tutorial, I wrote this code and found that a segmentation fault occurs w…
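For anyone trying to reproduce, a stripped-down version of the tutorial's flow on the 2.x API is sketched below; the toy model and random data are stand-ins for the reporter's code, which may help localize where the segmentation fault occurs.
```
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

model = torch.nn.Sequential(torch.nn.Linear(10, 10), torch.nn.ReLU())
data = DataLoader(
    TensorDataset(torch.randn(16, 10), torch.zeros(16, dtype=torch.long)),
    batch_size=4,
)

conf = PostTrainingQuantConfig(approach="static")
q_model = quantization.fit(model, conf, calib_dataloader=data)
q_model.save("./quantized_model")
```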
-
I followed the quick start guide, and an error occurred when I tried to run the Python script. It seems to be a dependency error. I searched the internet and did not find a solution. How to solve t…
-
https://github.com/intel/neural-compressor/tree/master/examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/weight_only
```
bash run_quant.sh --input_model=./Meta-Llama-3.1-8B -…
```
-
The PostTrainingQuantConfig below produces fp32 ops for the NPU on 2.4.1; models with int8 and fp16 ops would be preferred for the NPU.
```
conf=PostTrainingQuantConfig(quant_level='auto',
                             device='n…
```
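Purely as an illustrative guess at the truncated fields, the full config may have looked like the sketch below; the device string is a hypothetical completion, and nothing here is a verified 2.4.1 recipe for int8/fp16 NPU output.
```
from neural_compressor import PostTrainingQuantConfig

# Hypothetical completion of the truncated config; "npu" is a guess.
conf = PostTrainingQuantConfig(
    quant_level="auto",
    device="npu",
)
```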
-
### Describe the issue
I am using the int8-quantized version of the BGE-reranker-base model converted to ONNX, and I am processing the inputs in batches. The scenario is that I am experiencing …
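For reference, a typical batched scoring loop for an int8 ONNX reranker looks like the sketch below; the model path, tokenizer checkpoint, and input names are assumptions based on the usual BGE export, not the reporter's files.
```
import onnxruntime as ort
from transformers import AutoTokenizer

# Assumed paths/names; the real model is the reporter's int8 ONNX export.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-base")
session = ort.InferenceSession("bge_reranker_int8.onnx",
                               providers=["CPUExecutionProvider"])

query = "what is quantization?"
docs = ["Quantization lowers numeric precision.",
        "Bears hibernate in winter."]

# Pad to the longest sequence in the batch; ragged batches change the
# padded length per call, which can make per-batch latency uneven.
enc = tokenizer([query] * len(docs), docs,
                padding=True, truncation=True, return_tensors="np")
scores = session.run(None, {"input_ids": enc["input_ids"],
                            "attention_mask": enc["attention_mask"]})[0]
print(scores.squeeze(-1))
```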