-
**Is your feature request related to a problem? Please describe.**
Along with the model inference output, we would like a file (JSON, TXT, or XML) that contains all the information requ…
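The request above is truncated, but the general shape of such a sidecar file can be sketched. All field names below (`model`, `inputs`, `outputs`, `extra`) are illustrative assumptions, not the schema the issue asks for:

```python
import json


def write_inference_record(path, model_name, inputs, outputs, extra=None):
    """Write a JSON sidecar file describing one inference run.

    The field names here are only a sketch; the issue's exact
    requirements are cut off above, so treat this as illustrative.
    """
    record = {
        "model": model_name,
        "inputs": inputs,
        "outputs": outputs,
    }
    if extra:
        record.update(extra)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    return record
```

A TXT or XML variant would carry the same fields in a different serialization; JSON is shown only because it is the first format the request lists.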
-
**Describe the bug**
I'm compressing a qwen2.5_7b model using `examples/quantization_2of4_sparse_w4a16/llama7b_sparse_w4a16.py`, but loading the stage_sparsity model fails. The error is shown belo…
-
We have added support for returning the results from `KibanaResponseFactory`. This works well with our inference when using the `ok` function, since we can unwrap the object we pass back.
But when us…
-
**Describe the bug**
When running `inference.py --data-dir data --class-map class_map.txt --model efficientnet_b7 --num-classes 8 --checkpoint output/model_best.pth.tar` where `data` is a directory t…
-
### System Info
Ubuntu, CPU only, Conda, Python 3.10
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### 🐛 Describe the bug
I am running a single node stack with …
-
Enhance RAGGENIE by integrating Ollama as a new LLM provider, enabling users to run inference with self-hosted language models.
**Task:**
- Develop an Ollama loader to facilitate inference g…
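The loader interface RAGGENIE expects is not shown in the truncated task list, so only the request shape is sketched here. The endpoint and body fields follow Ollama's documented `/api/generate` REST API; everything else (function names, the `generate` wrapper) is an assumption:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint


def build_generate_payload(model, prompt, stream=False):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}


def generate(model, prompt, base_url=OLLAMA_URL, timeout=60):
    """Send one non-streaming generation request to a running Ollama server.

    Requires a local Ollama instance; the surrounding RAGGENIE loader
    class this would plug into is not shown in the issue.
    """
    body = json.dumps(build_generate_payload(model, prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With `stream=False`, Ollama returns a single JSON object whose `response` field holds the full completion, which keeps the loader simple.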
-
**Describe the bug**
I am trying to use Meta 1 and 2, which require inference support.
I am getting this error: `Unsupported model us.meta.llama3-1-70b-instruct-v1:0, please use models API to get…
-
### Describe the issue
Hello, I am interested in using v1.20.0 on OpenVINO hardware, as the new version claims to have optimized first-inference latency. It seems that v1.20.0 has been released for [o…
-
### 🔎 Search Terms
generic function spread regression inference
### 🕗 Version & Regression Information
- This is the behavior in every version I tried, and I reviewed the FAQ for entries about gene…
-
### System Info
**Platform:** Dell 760xa with 4x L40S GPUs
**OS Description:** Ubuntu 22.04.5 LTS
**GPU:** NVIDIA-SMI 550.90.07 Driver Version: 550.90.07 CUDA Version: 12.4
*…