-
Hi there. Thanks for the excellent work. We are shocked by such a huge performance improvement.
When reproducing the results, I encountered the following problems. I'd be very appreciative if you cou…
-
### System information
j
### Code to reproduce issue
bb
### Describe the problem
I want to deploy an MLflow Spark app to SageMaker. Is this possible? As I successfully pushed an image to ECR…
-
### Bug description
When running
`python3 triton-inference.py --input "Paris is the [MASK] of France."`
the following is returned:
```
Processing input...
Input processed.
Executing model...…
-
### Search before asking
- [X] I have searched the [existing issues](https://github.com/PaddlePaddle/PaddleDetection/issues) and found no similar bug. I have searched the [issues](https://github.com/PaddlePaddle/PaddleDetection/issu…
-
We have fine-tuned Electra for question answering on a custom dataset and now would like to export it to a SavedModel to use with TensorFlow Serving. We're using TensorFlow 1.15.4. Here's our code:
…
-
I want to use BERT for a sentiment classification task. I fine-tuned BERT on a dataset and got a usable model, but then I found that predicting a single sample is very slow. Someone said the reason is …
-
/kind feature
**Describe the solution you'd like**
I was trying the `KF-Serving model web app` and tried to deploy a canary rollout, which was working using kubectl commands, but I don’t see any opti…
-
/kind feature
I think this issue should be separated
* https://github.com/kserve/kserve/issues/719 (NCNN | MNN | Openvino)
https://hub.docker.com/r/openvino/model_server
https://docs.openv…
-
I tried to launch OpenCoderPlus with the latest code of this repo and vLLM:
```bash
python -m ochat.serving.openai_api_server --model-type opencoder --model openchat/opencoderplus
```
It can w…
-
### Description
Instead of spinning up a GPU nodegroup, spin up a CPU nodegroup with Elastic Inference (GPU accelerated inference).
### Additional Context
* https://aws.amazon.com/machine-lea…