-
/kind bug
**What steps did you take and what happened:**
I ran the inference service on a custom XGBoost model that I trained and saved with a `.joblib` extension, using the PVC storage option, and followed th…
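For context, the artifact was produced roughly like this (a minimal sketch; the dataset, hyperparameters, and file name are placeholders, not the actual setup):
```python
import joblib
import numpy as np
from xgboost import XGBClassifier

# Train a toy XGBoost model (placeholder data standing in for the real dataset).
X = np.random.rand(100, 4)
y = np.random.randint(0, 2, size=100)

model = XGBClassifier(n_estimators=10)
model.fit(X, y)

# Persist with joblib as described above; the resulting file is then
# copied onto the PVC that the InferenceService mounts.
joblib.dump(model, "model.joblib")
```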
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
This PR is part of an effort to improve the integration of Feast with model serving. Also see #4139 and the accompanying draft [RFC](https://docs.google.com/document/d/1PzBbTs_8R73XhuDq3CO0slmGy5S_ci2rwtbx1L-…
-
### System Info
Image: v1.2 CPU
Model used: jinaai/jina-embeddings-v2-base-de
Deployment: Docker / RH OpenShift
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officiall…
-
### Feature request
Currently, the OTLP service name is hard-coded as `"text-generation-inference.server"`.
Could an environment variable be added to set this? Something like...
resour…
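For illustration only: in the OpenTelemetry Python SDK, the service name reaches the exporter through a `Resource`, so the request amounts to feeding that value from the environment. The variable name below is hypothetical (TGI's tracing is implemented in Rust; this is just a sketch of the pattern):
```python
import os

from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Hypothetical variable name; falls back to the current hard-coded value.
service_name = os.environ.get("OTLP_SERVICE_NAME", "text-generation-inference.server")

resource = Resource.create({"service.name": service_name})
provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
```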
-
### What happened?
I have tried a couple of different models hosted on NVIDIA NIM, but none of them supports system messages, frequency penalty, or presence penalty. This is causing errors that (I t…
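For reference, a request that exercises all three features at once against an OpenAI-compatible endpoint looks roughly like this (base URL, key, and model name are placeholders):
```python
from openai import OpenAI

# Placeholder endpoint and credentials for a NIM-hosted deployment.
client = OpenAI(base_url="https://example.com/v1", api_key="not-a-real-key")

# Exercises the three features the models reportedly reject:
# a system message, frequency_penalty, and presence_penalty.
response = client.chat.completions.create(
    model="example-model",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
    ],
    frequency_penalty=0.5,
    presence_penalty=0.5,
)
print(response.choices[0].message.content)
```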
-
/kind bug
**What steps did you take and what happened:**
Deployed the InferenceService `iris-classifier-deployment`:
```
% kubectl get inferenceservices
NAME                          URL …
```
-
**Link to the notebook**
In the code below I am clearly passing a different instance type, on which I want to deploy my trained model:
```
finetuned_predictor = estimator.deploy(
    instance_type='ml.…
```
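One way to check which instance type the endpoint actually received is to read back its endpoint config (a sketch using boto3; the endpoint name is a placeholder):
```python
import boto3

sm = boto3.client("sagemaker")

# Placeholder endpoint name; use the name of the deployed endpoint.
endpoint = sm.describe_endpoint(EndpointName="my-endpoint")
config = sm.describe_endpoint_config(
    EndpointConfigName=endpoint["EndpointConfigName"]
)

# Print the instance type each production variant is actually running on.
for variant in config["ProductionVariants"]:
    print(variant["VariantName"], variant["InstanceType"])
```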
-
**Description**
After running with the Python vLLM backend, Triton crashed with signal 11 (SIGSEGV). The model had been loaded and warmed up for some time before the crash occurred.
**Triton Information**
What ve…
-
[TF Lite Micro (link - supported platforms)](https://www.tensorflow.org/lite/microcontrollers#supported_platforms) makes local, on-device ML inference possible, enabling powerful example applications like…