-
## 🚀 Feature
Multiple endpoints, such as `/embedding`, `/vlm/predict`, or `/ocr/predict`.
### Motivation
I would like to host multiple models on a single GPU for different purposes. It would be …
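The routing side of this request can be sketched with a plain Python registry that maps endpoint paths to per-model predict functions; the `register_model` and `dispatch` names are hypothetical illustrations, not the project's API, and a real server would mount these as routes in one process sharing a single GPU.

```python
from typing import Callable, Dict

# Hypothetical registry mapping endpoint paths to model callables.
_MODELS: Dict[str, Callable[[str], object]] = {}

def register_model(path: str, predict_fn: Callable[[str], object]) -> None:
    """Attach one model's predict function to an endpoint path."""
    _MODELS[path] = predict_fn

def dispatch(path: str, payload: str) -> object:
    """Route a request to whichever model owns the endpoint."""
    if path not in _MODELS:
        raise KeyError(f"no model mounted at {path}")
    return _MODELS[path](payload)

# Lightweight stand-ins for embedding / VLM / OCR models:
register_model("/embedding", lambda text: [float(len(text))])
register_model("/vlm/predict", lambda text: {"caption": text.upper()})
register_model("/ocr/predict", lambda text: {"text": text.strip()})
```

With something like this, one GPU-resident process can answer `dispatch("/embedding", ...)` and `dispatch("/ocr/predict", ...)` without spinning up a server per model.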
-
-
### What happened?
When adding a model via the admin UI (Models -> Add new model), it is not possible to add a model (model_info).
The UI reports success: _Model created successfully. Wait 60s and refresh…
-
### Description
![image](https://github.com/user-attachments/assets/934f38a6-f3cc-4c8e-a7df-55dab0a24a58)
uploaded from chat are at the bottom with token counts, when reindexing or on a fresh uploa…
-
I'm testing Unsloth's RoPE kernels, and here is my script:
```python
import torch
from unsloth.kernels.rope_embedding import fast_rope_embedding
from unsloth.models.llama import LlamaRotaryEmbedding as Uns…
```
-
Currently, the Inference API sends a validation request to third-party services when creating an inference endpoint with a task type of TEXT_EMBEDDING. This issue is to add validation calls to inferenc…
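The shape of such a validation check can be sketched as follows; this is an assumption-laden illustration, not the actual Inference API code — the `validate_embedding_response` helper and the `embedding` response field are invented here for the example.

```python
from typing import Any, Dict

def validate_embedding_response(body: Dict[str, Any], expected_dim: int) -> bool:
    """Check that a third-party service answered a tiny test request
    with a numeric vector of the expected dimension."""
    vector = body.get("embedding")
    if not isinstance(vector, list) or len(vector) != expected_dim:
        return False
    # Every element must be numeric for the endpoint to be usable.
    return all(isinstance(v, (int, float)) for v in vector)
```

The idea is simply to fail endpoint creation early, at configuration time, rather than on the first real query.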
-
Apologies if this is a n00b question. I'm trying out the multimodal embedding examples here:
https://lancedb.github.io/lancedb/embeddings/default_embedding_functions/
Retrievals aren't too bad for C…
-
- [ ] [README.md · BAAI/bge-reranker-large at main](https://huggingface.co/BAAI/bge-reranker-large/blob/main/README.md?code=true)
# README.md · BAAI/bge-reranker-large at main
## FlagEmbedding
Flag…
-
@Sazan-Mahbub has volunteered to lead this section. It may grow to include others' contributions as well.
-
### The Feature
Currently, I notice that the schema for the Redis semantic cache enforces a dimension size of **1536**. This works well with OpenAI's text-embedding-ada-002 model but fails for any ot…
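A minimal sketch of the requested change, assuming the schema is built by a helper where the vector dimension is a parameter rather than a hardcoded 1536 (the `build_cache_schema` function and field names below are hypothetical, not the library's actual schema code):

```python
from typing import Any, Dict

def build_cache_schema(embedding_dim: int = 1536) -> Dict[str, Any]:
    """Build a Redis vector-index field definition with a configurable
    dimension, so embedding models other than ada-002 also fit."""
    if embedding_dim <= 0:
        raise ValueError("embedding_dim must be positive")
    return {
        "name": "prompt_vector",
        "type": "vector",
        "attrs": {
            "dims": embedding_dim,  # 1536 for ada-002; 768, 1024, ... for other models
            "distance_metric": "cosine",
            "algorithm": "hnsw",
            "datatype": "float32",
        },
    }
```

Keeping 1536 as the default preserves today's behavior while letting callers pass the dimension their embedding model actually produces.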