-
- [ ] [Announcing Together Inference Engine 2.0 with new Turbo and Lite endpoints](https://www.together.ai/blog/together-inference-engine-2)
-
/kind feature
**Describe the solution you'd like**
Please add [https://github.com/xorbitsai/inference](https://github.com/xorbitsai/inference) as a KServe Hugging Face LLM serving runtime.
Xor…
-
**What would you like to be added/modified**:
Sedna is an edge-cloud synergy AI project incubated in KubeEdge SIG AI. Benefiting from the edge-cloud synergy capabilities provided by KubeEdge, Sed…
-
### Is this a new feature, an improvement, or a change to existing functionality?
New Feature
### How would you describe the priority of this feature request
Critical (currently preventing usage)
…
-
OpenAI now exposes usage stats for the streaming completion APIs:
https://community.openai.com/t/usage-stats-now-available-when-using-streaming-with-the-chat-completions-api-or-completions-api/738156…
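A minimal sketch of how the streamed usage stats could be consumed. It assumes the shape used by the `openai` Python SDK: when `stream_options={"include_usage": True}` is passed, the final chunk of a streamed chat completion carries a `usage` object and earlier chunks have `usage=None`. The helper relies only on that shape.

```python
def usage_from_stream(stream):
    """Return the usage object from the last chunk that carries one, or None."""
    usage = None
    for chunk in stream:
        # Earlier chunks carry usage=None; only the final chunk has real usage.
        chunk_usage = getattr(chunk, "usage", None)
        if chunk_usage is not None:
            usage = chunk_usage
    return usage
```

With the real client, the stream would come from `client.chat.completions.create(..., stream=True, stream_options={"include_usage": True})` and be passed to this helper.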
-
**Describe the bug**
After updating to 0.3.21, I'm getting:
2024-07-27 13:34:07,646 - MemGPT.memgpt.server.server - DEBUG - Starting agent step
/MemGPT/memgpt/data_types.py:92: UserWarning: Failed to…
-
## ❓ General Questions
I have verified that TVM is installed on my arm64 device, and I want to run mlc_llm on it for model inference. But when I installed mlc_llm on my device li…
-
Hi,
Can I deploy my own custom embedding model with LitServe? Is there any documentation on this?
-
If you're encountering an error while pulling the `latest` tag of the `huggingface/text-generation-inference` Docker image, follow these steps to resolve it:
#### Steps to Fix
1. **Find the Spec…
-
### The Feature
Ensure that we can access our fine-tuned Gemini via the Google AI Studio adapter. I haven't tested it yet.
### Motivation, pitch
You can fine-tune Google Gemini Pro 1.0 with yo…