-
### Is your enhancement related to a problem? Please describe
While the inference server page lists the information, it is not easy to decipher. And we would like to introduce more sections …
-
### Search before asking
- [X] I have searched the Inference [issues](https://github.com/roboflow/inference/issues) and found no similar bug report.
### Bug
If you install `inference-gpu` on a Mac…
-
OpenAI can now expose usage stats for the streaming completion APIs
https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156…
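As a rough illustration of what consuming those stats could look like: the Chat Completions API accepts `stream_options={"include_usage": True}` alongside `stream=True`, and then sends one extra final chunk whose `choices` list is empty and whose `usage` field carries the token counts. The sketch below assumes the chunks have already been decoded into plain dicts with that shape; the helper names `build_streaming_request` and `collect_stream` are hypothetical and not part of any library.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


def build_streaming_request(model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a request body that asks for usage stats on a streamed completion."""
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        # Opt in to the extra final chunk that carries token usage.
        "stream_options": {"include_usage": True},
    }


def collect_stream(chunks: Iterable[Dict[str, Any]]) -> Tuple[str, Optional[Dict[str, int]]]:
    """Concatenate streamed text and pick up the usage block if one arrives.

    Assumes each chunk is a dict decoded from the streamed JSON. Regular
    chunks have `usage` set to None; the usage chunk has no choices.
    """
    text_parts: List[str] = []
    usage: Optional[Dict[str, int]] = None
    for chunk in chunks:
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                text_parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage
```

If the stream is interrupted before the last chunk arrives, `usage` stays `None`, so callers should treat it as optional.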
-
### Describe the issue
This issue is a place to discuss the impact of not being able to rely on the `name` field on messages and existing, or proposed, solutions to cater for this.
---
The `n…
-
### Jan version
0.5.4
### Describe the Bug
Jan is not using GPU
### Steps to Reproduce
Model: Bielik-11B-v2.3-Instruct.Q8_0.gguf
GPU: NVIDIA RTX 4070 SUPER
### Screenshots / Logs
![obraz_2024-…
-
/kind feature
**Describe the solution you'd like**
I hope [https://github.com/xorbitsai/inference](https://github.com/xorbitsai/inference) can be added as a KServe serving runtime for Hugging Face LLMs.
Xor…
-
### Describe the bug
[2024-08-08T07:17:44,731][WARN ][a.d.h.t.HuggingFaceTokenizer] [opensearch-ml-node] maxLength is not explicitly specified, use modelMaxLength: 512
[2024-08-08T07:17:44,737][ER…
-
- [ ] [Announcing Together Inference Engine 2.0 with new Turbo and Lite endpoints](https://www.together.ai/blog/together-inference-engine-2)
# Announcing Together Inference Engine 2.0 with new Turbo …
-
http://127.0.0.1:8000/krai_qaic_task/benchmark/QuickBenchmarking
-
http://127.0.0.1:8000/tmp/License/#device-details
QAIC