-
**Describe the bug**
When compiling to NPU, a runtime error is raised.
```
--> compiled_model = ov.compile_model(converted_model, device_name='NPU')
RuntimeError: Exception from src/inference/src/c…
```
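A hedged workaround sketch (not from the report): when a device-specific compile fails, fall back to another device. The helper below is hypothetical; with OpenVINO it would wrap `ov.compile_model`.

```python
def compile_with_fallback(compile_fn, model, devices=("NPU", "CPU")):
    """Try compiling for each device in order; return (device, compiled_model).

    compile_fn is expected to match the signature of ov.compile_model,
    i.e. compile_fn(model, device_name=...). The helper name is hypothetical.
    """
    last_err = None
    for device in devices:
        try:
            return device, compile_fn(model, device_name=device)
        except RuntimeError as err:
            last_err = err
    raise RuntimeError(f"Compilation failed on all devices: {last_err}")

# With OpenVINO this would be used as (an assumption, not verified on NPU):
#   import openvino as ov
#   device, compiled_model = compile_with_fallback(ov.compile_model, converted_model)
```

This keeps the NPU-specific failure visible in logs while still producing a usable compiled model on a supported device.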
-
### Issue type
Support
### Have you reproduced the bug with TensorFlow Nightly?
Yes
### Source
source
### TensorFlow version
2.13
### Custom code
Yes
### OS platform and distribution
_No re…
-
## Reporting a bug
When I called the LLVM package from tinygrad, I got the following error:
`Aborted (core dumped)` with `Symbol not found: __gnu_f2h_ieee`.
This issue seems to have been resolved in ear…
-
0x416
Medium
# Lack of error handling when making blockless API call
## Summary
Lack of error handling when making blockless API call
## Vulnerability Detail
Error handling when making blockless…
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/ray-project/kuberay/issues) and found no similar feature requirement.
/cc Bytedancer @Basasuya @Yicheng-Lu-llll
…
-
For now I am doing inference on images following the code in the inference.ipynb file. However, I have realised that it uses image paths to make the inference. Due to limitatio…
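A minimal sketch (my assumption, not the notebook's code) of decoding an image from in-memory bytes instead of a file path, using Pillow and NumPy; `model.predict` is a hypothetical stand-in for the notebook's inference call.

```python
import io

import numpy as np
from PIL import Image


def image_bytes_to_array(image_bytes: bytes) -> np.ndarray:
    """Decode raw image bytes (e.g. from an upload or socket) without writing to disk."""
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    return np.asarray(img)  # shape: (height, width, 3)


# Hypothetical usage, if the model accepts arrays instead of paths:
#   arr = image_bytes_to_array(payload)
#   result = model.predict(arr)
```

Any inference code that opens the path with Pillow internally can usually be adapted this way, since `Image.open` accepts any file-like object.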
-
### Description
Customer is interested in using the Elasticsearch inference API with text generation models on Hugging Face, whereas as of 8.15 we are limited to supporting only text_embedding.
-
While we support batched inference like other constrained decoding libraries, the current implementation can be parallelized further. In particular, we can mask logits in batch and run several `kbnf` …
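The batch masking step mentioned above can be sketched as a single vectorized operation (NumPy here for illustration; the function name is mine, not kbnf's API):

```python
import numpy as np


def mask_logits_batch(logits: np.ndarray, allowed: np.ndarray) -> np.ndarray:
    """Mask disallowed tokens for the whole batch at once.

    logits:  float array of shape (batch, vocab)
    allowed: bool array of shape (batch, vocab), True where the grammar permits a token
    Disallowed positions are set to -inf so softmax assigns them zero probability.
    """
    return np.where(allowed, logits, -np.inf)
```

One call over the `(batch, vocab)` arrays replaces a per-sequence Python loop, which is where the parallelization win comes from.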
-
### 🚀 The feature, motivation and pitch
I launched an LLM service with vLLM, and I use the AsyncOpenAI client for high-throughput output, like this:
```
async def async_llm_infer_sampling(prompt, a…
```
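The fan-out pattern behind that snippet can be sketched with plain asyncio (names are mine; `call_fn` stands in for the AsyncOpenAI completion call, which is not shown in the truncated issue):

```python
import asyncio


async def infer_many(call_fn, prompts, limit=8):
    """Run one async inference per prompt, with at most `limit` in flight at once.

    call_fn: an async function taking a prompt and returning a completion;
    with vLLM's OpenAI-compatible server it would wrap
    client.chat.completions.create (an assumption, not the issue's code).
    """
    sem = asyncio.Semaphore(limit)

    async def bounded(prompt):
        async with sem:
            return await call_fn(prompt)

    return await asyncio.gather(*(bounded(p) for p in prompts))
```

The semaphore bounds concurrency so a large prompt list does not flood the server, while `asyncio.gather` preserves the input order in the results.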
-
### Bug Description
The `e2e-wine-kfp-mlflow-kserve` test fails in Azure one-click deployment with the KServe `InferenceService` not being found. The pipeline run succeeds, then when the lightkube cl…
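If the failure is a race between the pipeline finishing and the `InferenceService` appearing, a polling helper like the hypothetical sketch below (not the test's actual code) is one way to wait before querying with lightkube:

```python
import time


def wait_until(check_fn, timeout=300.0, interval=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll check_fn until it returns truthy or the timeout elapses.

    check_fn would wrap the lightkube lookup of the InferenceService
    (an assumption about the test's structure, not its actual code).
    clock and sleep are injectable to make the helper testable.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if check_fn():
            return True
        sleep(interval)
    return False
```

Returning `False` instead of raising lets the test decide whether a missing `InferenceService` is a hard failure or a retryable condition.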