serving Search Results - Githubissues

1000+ results for serving

vllm-project/vllm #8074

[Feature]: Support multi-node serving on Kubernetes

### 🚀 The feature, motivation and pitch Hi, I'm currently working on **deploying vLLM distributed on multi-node in k8s cluster**. I saw that the official documentation provided a link by using [LWS…

linnlh updated 3 days ago
6
sveltejs/kit #12815

Hooking into asset serving to avoid 404

### Describe the problem From what I know it's not possible to hook into the serving of build artifacts in the `_app` folder. The biggest use case I have for it is to be able to hook into it to preve…

andersekdahl updated 1 month ago
11
nestjs/swagger #3157

Add option to disable json and yaml API definition serving

### Is there an existing issue that is already proposing this? - [X] I have searched the existing issues ### Is your feature request related to a problem? Please describe it There is no way to disa…

wiliamsouza updated 2 days ago
2
InftyAI/llmaz #85

Parallel model serving

**What would you like to be added**: Similar to kserve https://kserve.github.io/website/latest/modelserving/v1beta1/custom/custom_model/#parallel-model-inference **Why is this needed**: *…

kerthcet updated 3 months ago
1
cert-manager/cert-manager #7138

Failed to generate serving certificate

**Describe the bug**: After we updated cert-manager to `v1.15.0`, we started to see the following error in the logs of cert-manager webhook pod: `"Failed to generate serving certificate, retry…

syurevich updated 2 weeks ago
29
outofcoffee/imposter #630

soap issue - Serving mock example - Docker image

Issue: SOAP Plugin returning mock example instead of the defined response when using shared folder mapping I am encountering an issue with the Imposter SOAP plugin when mapping a shared folder for …

DeveOwn updated 2 weeks ago
1
LMCache/LMCache #232

[Feature Request] Distributed serving is failing with CUDA o…

**Is your feature request related to a problem? Please describe.** Tried to run custom 40B model, whose weights can be loaded with 2 80GB GPU's VRAM. lmcache is able to load small models with in sin…

kzos updated 2 days ago
3
kubernetes-sigs/wg-serving #14

[Serving Catalog] Add HPA configurations

One important (and non-trivial) aspect of running model servers today is to ensure they are able to scale horizontally in response to load. Today, traditional CPU/Memory-based autoscaling are not suff…

raywainman updated 1 month ago
2
vllm-project/vllm #9739

[Bug]: ValueError: At most 1 image(s) may be provided in one…

### Your current environment vllm-openai/v06.3.1.post-1 ### Model Input Dumps a_request: None, prompt_adapter_request: None. 2024-10-27 23:04:39 INFO 10-27 09:04:39 engine.py:290] Added request ch…

eav-solution updated 3 weeks ago
5
canonical/knative-operators #243

Can't integrate rocks to `securityContext.runAsNonRoot`: `tr…

### Bug Description While working on `net-istio-webhook` extension rock for knative we had encountered a problem where we can't run rocks in `securityContext.runAsNonRoot`: `true` Kubernetes deploym…

misohu updated 1 week ago
3
