-
Hi, I am learning Vitis AI 3.0 and trying to run the `VCK190` resnet18 Quickstart tutorial.
In the "PyTorch tutorial" section:
> Step 7: Next, let's run…
-
Prior to filing: check whether this should be a bug report rather than a feature request. Everything supported, including the compatible TensorFlow versions, is listed on the overview page of each technique. …
-
https://github.com/intel/neural-compressor/tree/master/examples/onnxrt/nlp/huggingface_model/text_generation/llama/quantization/weight_only
bash run_quant.sh --input_model=./Meta-Llama-3.1-8B -…
-
```
got prompt
!!! Exception during processing!!! No GPU found. A GPU is needed for quantization.
Traceback (most recent call last):
File "/Users/liangbinsi/Documents/ComfyUI/execution.py", line…
```
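The error is raised because the quantization backend hard-requires a CUDA device, which no Apple-silicon Mac exposes. A minimal sketch of the kind of device-fallback check involved (a hypothetical helper, not ComfyUI's actual code; `torch` is treated as an optional import):

```python
def pick_device() -> str:
    """Return the best available compute device, falling back to CPU."""
    try:
        import torch  # optional; absent on minimal installs
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA GPU
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple-silicon GPU backend
    return "cpu"

print(pick_device())
```

A quantizer that only accepts `"cuda"` will fail on machines where this returns `"mps"` or `"cpu"`, which matches the traceback above.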
-
### Checklist
- [X] Checked the issue tracker for similar issues to ensure this is not a duplicate
- [X] Read the documentation to confirm the issue is not addressed there and your configuration i…
-
### OpenVINO Version
openvino: 2024.3.0
### Operating System
Windows
### Device used for inference
iGPU
### OpenVINO installation
PyPI
### Programming Language
Python
### Hardware Ar…
-
## Description
I recently tried INT8 quantization with Stable Diffusion XL to improve inference performance, based on the claims made in a recent [TensorRT blog post](https://developer.…
-
There are scenarios where quants may be recreated (e.g. gemma) or templates may be updated in the model registry.
If there were some way to show this as info during `ls` or `pull` commands, it ca…
-
Hi! I'm trying to run the Q4_K_M quantization of Meta-Llama-3-8B-Instruct on my Mac (M2 Pro, 16GB VRAM) using llama-cpp-python, with the following test code:
```
from llama_cpp import Llama
llm4 …
```
-
### **Initial action plans**
Copying these items over from the wav2vec2 repo for safekeeping.
* An immediate quantization step could be to convert the fine-tuned model using the TFLite APIs. [Post-trainin…
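Post-training quantization of the kind referenced above maps float weights to 8-bit integers via a scale and a zero-point. A minimal illustrative sketch of the affine int8 scheme in plain Python (not the actual TFLite API):

```python
def quantize_int8(values):
    """Affine int8 quantization: q = round(x / scale) + zero_point."""
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)  # range must cover 0
    scale = (hi - lo) / 255.0 or 1.0  # guard against an all-zero tensor
    zero_point = round(-128 - lo / scale)  # maps lo -> -128, hi -> 127
    quant = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return quant, scale, zero_point

def dequantize_int8(quant, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(q - zero_point) * scale for q in quant]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(weights)
restored = dequantize_int8(q, scale, zp)
```

Forcing the range to include zero means 0.0 is representable exactly, which is why zero-padding and ReLU outputs survive quantization without bias; the round-trip error for every other value is bounded by the scale.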