-
Hello,
I have tried running the Llama2 model with 3-bit and 4-bit quantization. Is there a way to apply and run an INT8-quantized Llama2 model on AMD?
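For context, what I have in mind is something like the following sketch, assuming a ROCm build of PyTorch and a bitsandbytes version with ROCm support (the model ID is illustrative):

```python
# Hedged sketch: load Llama2 with 8-bit weight quantization via
# transformers + bitsandbytes. Assumes a ROCm-enabled PyTorch build and a
# bitsandbytes release that supports ROCm; the model ID is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",  # place layers on the available GPU(s)
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```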
Regards,
Ashima
-
Hi,
I trained a Keras model to extract gray-level segmentation maps.
I converted the model to TFLite and quantized it.
The quantized model produces similar results on CPU and DSP hardware, if th…
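For reference, a minimal sketch of full-integer post-training quantization in TFLite (a toy model and random data stand in for my real network and calibration set):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the trained segmentation model.
keras_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(1, 1, activation="sigmoid"),
])

def representative_dataset():
    # A few hundred typical inputs; random data here only for the sketch.
    for _ in range(100):
        yield [np.random.rand(1, 64, 64, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict to int8 kernels so the same model can run on an integer-only DSP.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
open("model_int8.tflite", "wb").write(tflite_model)
```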
-
### Describe the issue
1. Tried running https://github.com/intel/intel-extension-for-pytorch/blob/release/2.3/examples/cpu/inference/python/llm/run.py to generate the q_config_summary file
2. Then…
-
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained("echarlaix/stable-diffusion-v1-5-inc-int8-dynamic").to("cpu")
# for reducing memory consumption get a…
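# Hypothetical continuation (not in the original snippet): run a short
# text-to-image generation on CPU; prompt and step count are illustrative.
prompt = "a photo of an astronaut riding a horse"
image = pipe(prompt, num_inference_steps=20).images[0]
image.save("astronaut.png")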
-
### 🐛 Describe the bug
Hello,
I'm using the QuantTrainModule to train a MobileNetV2 model (using the MobileNetV2 class in this repo), and the quantized checkpoints have 32-bit floating-point weigh…
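One way to show what I'm seeing is to dump the dtypes stored in the saved checkpoint; a minimal sketch (the checkpoint path is a placeholder):

```python
# Inspect the dtypes of the tensors stored in a checkpoint.
# "checkpoint.pth" is a placeholder path.
import torch

state_dict = torch.load("checkpoint.pth", map_location="cpu")
# Some trainers nest the weights under a "state_dict" key.
state_dict = state_dict.get("state_dict", state_dict)

for name, tensor in state_dict.items():
    if torch.is_tensor(tensor):
        print(f"{name}: {tensor.dtype} {tuple(tensor.shape)}")
```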
-
### What happened?
Hello,
I've been experimenting with some Olive passes on a custom model containing a transformer and some extra layers. Using the passes seems to slow down both the throughput and …
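For what it's worth, this is roughly how I'd measure the slowdown; a hedged sketch assuming onnxruntime as the backend (model paths, input name, and input shape are placeholders):

```python
# Hypothetical timing harness comparing latency before/after the Olive passes.
# Model paths, the input name, and the input shape are placeholders.
import time
import numpy as np
import onnxruntime as ort

def mean_latency_ms(model_path, feed, runs=50):
    sess = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    for _ in range(5):  # warm-up
        sess.run(None, feed)
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, feed)
    return (time.perf_counter() - start) / runs * 1000

feed = {"input_ids": np.random.randint(0, 1000, (1, 128), dtype=np.int64)}
print("baseline :", mean_latency_ms("model.onnx", feed), "ms")
print("optimized:", mean_latency_ms("model_olive.onnx", feed), "ms")
```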
-
Error log:
```
Generating train split: 3457 examples [00:00, 14292.20 examples/s]
Map (num_proc=32): 0%| | 0/3457 [00:00
```
-
Hi,
Huge fan of your work. I was wondering: in your code, are you using 4-bit or 8-bit quantization for the LoRA?
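For anyone comparing, the two setups usually differ only in the quantization config passed when loading the base model; an illustrative sketch with transformers + bitsandbytes + peft (the LoRA hyperparameters are made up):

```python
# Illustrative only: common 8-bit vs. 4-bit (QLoRA-style) load configs.
# The LoRA hyperparameters below are made up for the example.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

int8_config = BitsAndBytesConfig(load_in_8bit=True)
int4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4, as in the QLoRA paper
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for the matmuls
)

# The LoRA adapters themselves stay in floating point either way.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
```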
-
Hi maintainers @yanboliang @Chillee ,
I saw that Int8 weight-only quantization is enabled for Mixtral 8x7B, and the next step should be supporting int4 and int4-gptq.
May I know the timeline of enabli…
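For context, int8 weight-only quantization keeps activations in floating point and stores only the weights as int8 with per-channel scales; a minimal sketch of the idea in plain PyTorch (not the Mixtral implementation):

```python
# Minimal sketch of symmetric per-output-channel int8 weight-only
# quantization for a linear layer (illustrative, not tuned for speed).
import torch

def quantize_weight_int8(w: torch.Tensor):
    # w: (out_features, in_features); one scale per output channel.
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.clamp((w / scale).round(), -128, 127).to(torch.int8)
    return q, scale

def int8_linear(x, q, scale, bias=None):
    # Dequantize on the fly; activations stay in floating point.
    return torch.nn.functional.linear(x, q.to(x.dtype) * scale, bias)

w = torch.randn(256, 512)
q, s = quantize_weight_int8(w)
x = torch.randn(4, 512)
print("max abs error:", (int8_linear(x, q, s) - x @ w.t()).abs().max().item())
```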
-
Is it due to mel.n_len = 3000 being the max for a single inference? If you feed some of the longer samples that whisper.cpp uses, I presume it's the mel.n_len = 3000 cap, as I know they are much longer.
``…