-
**Issue description:** When using a screen reader and submitting a query or message, there is no indication that the chat thread has been updated.
**WCAG Criteria:** [SC 1.3.1 Info and Relationships](ht…
-
### What happened?
I am currently using the LiteLLM proxy via the API only.
This is the container image version `ghcr.io/berriai/litellm:main-v1.44.2`
TL;DR: the team budget seems to work, individual me…
-
I have a function in a plug that calls `fetch` like this, where `body` is a string (a JSON-encoded object):
```javascript
const response = await fetch(
aiSettings.openAIBaseUrl + "/chat/comp…
-
An expert in TPU compiler development can potentially introduce sampling techniques into programs for specific purposes. Here's a breakdown of the concept:
**Sampling for TPU Programs:**
* **Expert-…
-
### What happened?
Here is an example usage:
```python
chunks = []
try:
    async for chunk in stream_resp:
        text = chunk.choices[0].delta.content or ""
        yield text
        chunks.ap…
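# (Snippet truncated above.) A self-contained sketch of the same
# yield-and-collect pattern. fake_stream() and its SimpleNamespace chunks
# are hypothetical stand-ins for the OpenAI-style stream objects in the
# snippet, not real library APIs:
import asyncio
from types import SimpleNamespace

async def fake_stream():
    # Mimics chunk.choices[0].delta.content, including a final None delta.
    for piece in ["Hel", "lo", None]:
        yield SimpleNamespace(
            choices=[SimpleNamespace(delta=SimpleNamespace(content=piece))]
        )

async def stream_text(stream_resp):
    chunks = []
    try:
        async for chunk in stream_resp:
            text = chunk.choices[0].delta.content or ""
            yield text
            chunks.append(text)
    finally:
        # The full response is available once the stream is exhausted.
        full_response = "".join(chunks)

async def main():
    collected = []
    async for text in stream_text(fake_stream()):
        collected.append(text)
    return "".join(collected)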
-
### Bug Description
I'm doing RAG using llama-index. The model is Phi3-mini-4k. I have experimented with all the models that support the sub-query engine. When comparing those models, I got pretty good results…
-
I'd like to run `topic_model.topics_over_time()` but only on a specific subset of documents and topics. Sometimes, when working with a large corpus with lots of topics, running it on all documents and to…
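One workaround, pending a built-in option, is to filter the documents and timestamps down to the topics of interest before calling `topics_over_time()`. The helper below is a minimal, hypothetical sketch (not a BERTopic API); it assumes you still have the per-document topic assignments returned by `fit_transform()`:

```python
# Hypothetical helper: keep only the docs/timestamps whose assigned topic
# is in keep_topics, so topics_over_time() runs on that subset alone.
def subset_for_topics(docs, timestamps, topic_assignments, keep_topics):
    keep = set(keep_topics)
    kept = [
        (doc, ts)
        for doc, ts, topic in zip(docs, timestamps, topic_assignments)
        if topic in keep
    ]
    sub_docs = [doc for doc, _ in kept]
    sub_timestamps = [ts for _, ts in kept]
    return sub_docs, sub_timestamps
```

Assumed usage: after `topics, probs = topic_model.fit_transform(docs)`, call `subset_for_topics(docs, timestamps, topics, keep_topics={0, 3, 7})` and pass the filtered lists to `topic_model.topics_over_time(...)`.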
-
Using the instructions here: https://github.com/ray-project/ray-llm#how-do-i-deploy-multiple-models-at-once I'm trying to host two models on a single A100 80G.
Two bundles are generated for the pla…
-
![error](https://github.com/user-attachments/assets/c6a351db-0074-4db7-bc68-9b6eb9f3081f)
After running the app.py file and putting the model in the web_app_storage/models folder, I get this er…
-
### Prerequisites
- [X] I am running the latest code. Mention the version if possible as well.
- [X] I carefully followed the [README.md](https://github.com/ggerganov/llama.cpp/blob/master/README.md)…