-
I ran some tests to find better parameters for speeding things up, but TTFT (Time To First Token) did not change significantly. Is my TTFT correct? I feel it might be a bit t…
-
## Bug Report
### System information
- **OS Platform and Distribution**: macOS
- **TensorFlow Serving installed from**: binary
- **TensorFlow version**: 2.14.1
### Describe the problem
We us…
-
### Feature request:
- TensorFlow Serving: https://github.com/tensorflow/serving
### Use case:
### UI Example:
-
/kind feature
**Describe the solution you'd like**
Currently, it is not possible to specify the path at which the downloaded model should be available in the model server container. The downloaded model…
-
Is there a way to cap the amount of resources (e.g. CPU cores, CUDA MPS threads) assigned to each model in a multi-model TensorFlow server?
The only way (straightforward way and not considering lower…
-
### 🚀 The feature, motivation and pitch
Thanks to our amazing community, we have gathered a set of good chat templates for models. These templates are useful when the original model's `tokenizer_config…
-
**Is your feature request related to a problem? Please describe.**
In the [first implementation of model registry and serving implementation](https://github.com/opendatahub-io/model-registry/issues/173…
-
### Anything you want to discuss about vllm.
Users may see the following error when trying to run vllm:
```
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
Fi…
```
-
Hi, I trained a model with a custom op and exported it using the `saved_model` API, and I would like to serve it using tfgo. However, since tfgo binds to the TensorFlow C library (more precisely `libtensorflow.so`)…
-
**Describe the bug**
The KServe community releases all repos together, irrespective of whether there are code changes in the individual repos.
**For example**, in release `v0.11.2`, a …