Closed · Ananderz closed this issue 12 months ago
We would like to explore this option, but we will not block LLM inferencing on it; we will just have OpenAI embeddings be the only option for providers without their own embedder (all LLM options except OpenAI & Azure).
So phase one is selection of any LLM for inferencing, with no option to select an embedding engine.
Then we add embedding engine selection once we can find a normalized interface for it. Doing sentence transformation ourselves would be a massive pain, and each embedding engine has a different API. Most seem to be running HuggingFace models.
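Not part of the codebase, just to make the "normalized interface" idea concrete: a TypeScript sketch where every embedding engine sits behind the same method, with illustrative stub adapters (all names here are hypothetical).

```ts
// Hypothetical sketch: a normalized embedder interface. The app would
// depend only on this shape, and each embedding engine gets its own
// adapter behind it.
interface Embedder {
  // Returns one embedding vector per input text chunk.
  embedChunks(chunks: string[]): Promise<number[][]>;
}

// Providers then differ only inside their adapters, e.g.:
class OpenAIEmbedder implements Embedder {
  async embedChunks(chunks: string[]): Promise<number[][]> {
    // ...call OpenAI's /embeddings endpoint here...
    throw new Error("illustrative stub");
  }
}

class HuggingFaceTEIEmbedder implements Embedder {
  async embedChunks(chunks: string[]): Promise<number[][]> {
    // ...call a HuggingFace Text Embeddings Inference server here...
    throw new Error("illustrative stub");
  }
}
```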
Do you know of any services that normalize the embedding API like LiteLLM does for LLM inferencing?
Hey @timothycarambat, LiteLLM supports the HuggingFace Text Embeddings Inference API, which should help here.
Indeed, I saw this, but it also requires us to either move our backend to Python or run the proxy server to have that interface available.
I believe it would just be an addition to the docker-compose; a sketch of calling it from the backend follows the file:
version: '3.9'
name: anythingllm
networks:
  anything-llm:
    driver: bridge
services:
  anything-llm:
    container_name: anything-llm
    image: anything-llm:latest
    platform: linux/amd64
    build:
      context: ../.
      dockerfile: ./docker/Dockerfile
      args:
        ARG_UID: ${UID}
        ARG_GID: ${GID}
    volumes:
      - "../server/storage:/app/server/storage"
      - "../collector/hotdir/:/app/collector/hotdir"
      - "../collector/outputs/:/app/collector/outputs"
    user: "${UID}:${GID}"
    ports:
      - "3001:3001"
    env_file:
      - .env
    networks:
      - anything-llm
    extra_hosts:
      - "host.docker.internal:host-gateway"
  # LiteLLM proxy, exposing an OpenAI-compatible API on port 8000
  litellm:
    image: ghcr.io/berriai/litellm:latest
    ports:
      - '8000:8000'
    environment:
      - PORT=8000
      - OPENAI_API_KEY=your-api-key
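If something like the above works, the existing Node backend would not need to move to Python; it could just call the proxy's OpenAI-compatible /embeddings route over HTTP. A rough sketch, where the URL follows the port mapping above and the model name is a placeholder:

```ts
// Rough sketch: calling the LiteLLM proxy's OpenAI-compatible
// /embeddings route from the Node backend (Node 18+ for global fetch).
// The URL matches the compose port mapping above; the model name is
// a placeholder for whatever LiteLLM is configured to route.
async function embedViaLiteLLM(chunks: string[]): Promise<number[][]> {
  const res = await fetch("http://localhost:8000/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: chunks }),
  });
  if (!res.ok) throw new Error(`LiteLLM proxy returned ${res.status}`);
  const data = await res.json();
  // OpenAI-style responses: { data: [{ embedding: number[] }, ...] }
  return data.data.map((d: { embedding: number[] }) => d.embedding);
}
```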
Hi,
I saw that local LLMs are on the roadmap. It won't make much sense if you don't also use local sentence transformers (see the MTEB leaderboard), like Instructor or E5.
I would suggest having this looked into before local LLMs, because you could potentially use an API endpoint for local LLMs.
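For illustration, a rough sketch of that last idea: if the local model is served behind an OpenAI-compatible endpoint (as servers such as LocalAI or llama.cpp's HTTP server expose), the backend can talk to it exactly as it talks to OpenAI. The URL and model name below are placeholder assumptions:

```ts
// Rough sketch: chatting with a locally served LLM through an
// OpenAI-compatible endpoint. The URL and model name are placeholders
// for whatever the local server is actually configured with.
async function localChat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Local LLM server returned ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```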