-
After compressing a BERT MRPC model and verifying performance with paddle_inference_eval, I found a large accuracy gap between int8 and fp32:
| `--precision` | Accuracy |
|---------------|----------|
| fp32          | 84       |
| fp16          | 84       |
| int8          | 61       |
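A minimal way to reproduce the comparison, assuming the report's `paddle_inference_eval` script takes only the `--precision` flag shown above (the script path and any other required flags are assumptions):

```python
# Hedged sketch: sweep the --precision flag from the report; the script name
# and the absence of other required flags are assumptions, not the real CLI.
import subprocess

for precision in ("fp32", "fp16", "int8"):
    subprocess.run(
        ["python", "paddle_inference_eval.py", f"--precision={precision}"],
        check=True,  # raise if an eval run fails
    )
```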
-
Just a quick question. I want my final model to be fully int8 instead of float32 for inputs and outputs, and I want training to be as accurate as possible. Do I train with quantised inputs and outputs? B…
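A minimal quantization-aware-training sketch, assuming a PyTorch workflow (the toy model and the omitted training loop are placeholders): `QuantStub`/`DeQuantStub` fake-quantize the model boundary during training, so the network learns under int8-like inputs and outputs before conversion.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()      # fake-quantizes the input
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()  # restores float at the output

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().train()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)
# ... run the usual float training loop here; fake-quant ops simulate int8 ...
model.eval()
int8_model = torch.ao.quantization.convert(model)  # real int8 kernels after training
```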
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
Add support to save and load a precompiled MIGraphX graph from the MIGraphX EP to speed up time to inference.
- [x] Save our precompiled graphs
- [x] Load in precompiled graphs
We do a similar so…
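A sketch of what the requested flow could look like from the ONNX Runtime Python API; the MIGraphX provider option names below are assumptions and may differ from the EP's actual interface:

```python
import onnxruntime as ort

# First run: compile once and persist the graph (option names assumed).
providers = [("MIGraphXExecutionProvider", {
    "migraphx_save_compiled_model": 1,
    "migraphx_save_model_path": "model.mxr",
})]
sess = ort.InferenceSession("model.onnx", providers=providers)

# Later runs: load the precompiled graph instead of recompiling.
providers = [("MIGraphXExecutionProvider", {
    "migraphx_load_compiled_model": 1,
    "migraphx_load_model_path": "model.mxr",
})]
sess = ort.InferenceSession("model.onnx", providers=providers)
```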
-
### Please ask your question
Can you explain why int8 TRT inference in Paddle_inference uses more GPU memory than fp32 TRT inference?
-
### Describe the bug
I am still facing the issue described in #1331. However, I am directly using torch.onnx.export on a loaded PatchcoreModel. As a result, self.memory_bank is not being initial…
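A hedged sketch of one workaround, assuming the trained checkpoint contains the fitted memory_bank buffer; the checkpoint filename, state-dict layout, and input size are assumptions:

```python
import torch

# Sketch: load the trained weights (which should include the fitted
# memory_bank buffer) before exporting; checkpoint layout is an assumption.
state = torch.load("patchcore.ckpt", map_location="cpu")
model.load_state_dict(state["state_dict"])  # 'model' is the loaded PatchcoreModel
model.eval()
torch.onnx.export(
    model,
    torch.randn(1, 3, 224, 224),  # dummy input; match the training resolution
    "patchcore.onnx",
    opset_version=14,
)
```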
-
Where can I download bloom-7b?
I noticed that int8 quantization is available, but is there an option for int4 quantization?
What is the memory overhead for int4 and int8 when using LoRA or PTuning f…
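For context, one way to get int4 today is bitsandbytes through transformers; whether that fits the asker's stack (e.g. NeMo with LoRA/P-Tuning) is an assumption, and the model id is the public Hub checkpoint:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Hedged sketch: 4-bit loading via bitsandbytes; this answers the int4
# question only for the Hugging Face stack, which is an assumption here.
bnb = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",   # public BLOOM 7B checkpoint on the Hub
    quantization_config=bnb,
    device_map="auto",
)
```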
-
Hello,
I hope this message finds you well. I followed the tutorial to convert the model; however, an error occurred during the conversion process. I am seeking clarification on t…
-
Hello guys, I wrote a streaming inference pipeline in Python for this project, including torch.jit.script, int8 dynamic quantization, and a streaming interface for the audio encoder and decoder (style v…
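A minimal sketch of the two techniques named above, where `encoder` is a placeholder for the project's audio encoder module, not its actual name:

```python
import torch

# Sketch: int8 dynamic quantization of Linear layers, then TorchScript
# compilation; 'encoder' stands in for the project's nn.Module.
qencoder = torch.ao.quantization.quantize_dynamic(
    encoder, {torch.nn.Linear}, dtype=torch.qint8
)
scripted = torch.jit.script(qencoder)
scripted.save("encoder_int8.pt")
```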
-
### Feature request
It would be immensely useful to have a server application to serve HF Transformers and other Hub models as a service, similar to how `llama.cpp` bundles the `llama-server`…
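A sketch of the requested shape, assuming FastAPI plus a transformers pipeline; the model id, route, and parameters are placeholders, not a proposed design:

```python
from fastapi import FastAPI
from transformers import pipeline

# Hedged sketch: the smallest possible HF model server; run with
#   uvicorn server:app
app = FastAPI()
generator = pipeline("text-generation", model="gpt2")

@app.post("/generate")
def generate(prompt: str, max_new_tokens: int = 64):
    # FastAPI treats these simple-typed params as query parameters.
    return generator(prompt, max_new_tokens=max_new_tokens)[0]
```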