Closed · Ananderz closed this issue 12 months ago
We would like to explore this option, but we will not block LLM inferencing on it; we will just have OpenAI embeddings be the only option for providers without their own embedder (all LLM options except OpenAI & Azure).
So phase one is selection of any LLM for inferencing, with no option to select an embedding engine.
Then we add embedding engine selection once we can find a normalized interface for it. Doing sentence transformation ourselves would be a massive pain, and each embedding engine has a different API. Most seem to be running HuggingFace models.
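Not part of the codebase, just to make the "normalized interface" idea concrete: a TypeScript sketch where every embedding engine sits behind the same method, with illustrative stub adapters (all names here are hypothetical).

```ts
// Hypothetical sketch: a normalized embedder interface. The app would
// depend only on this shape, and each embedding engine gets its own
// adapter behind it.
interface Embedder {
  // Returns one embedding vector per input text chunk.
  embedChunks(chunks: string[]): Promise<number[][]>;
}

// Providers then differ only inside their adapters, e.g.:
class OpenAIEmbedder implements Embedder {
  async embedChunks(chunks: string[]): Promise<number[][]> {
    // ...call OpenAI's /embeddings endpoint here...
    throw new Error("illustrative stub");
  }
}

class HuggingFaceTEIEmbedder implements Embedder {
  async embedChunks(chunks: string[]): Promise<number[][]> {
    // ...call a HuggingFace Text Embeddings Inference server here...
    throw new Error("illustrative stub");
  }
}
```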
Do you know of any services that normalize the embedding API like LiteLLM does for LLM inferencing?
Hey @timothycarambat, LiteLLM supports the HuggingFace Text Embeddings Inference API, which should help here.
Indeed, I saw this, but it also requires us to either move our backend to Python or run the proxy server to have that interface available.
I believe it would just be an addition to the docker-compose; a sketch of calling it from the backend follows the file:
version: '3.9'
name: anythingllm
networks:
  anything-llm:
    driver: bridge
services:
  anything-llm:
    container_name: anything-llm
    image: anything-llm:latest
    platform: linux/amd64
    build:
      context: ../.
      dockerfile: ./docker/Dockerfile
      args:
        ARG_UID: ${UID}
        ARG_GID: ${GID}
    volumes:
      - "../server/storage:/app/server/storage"
      - "../collector/hotdir/:/app/collector/hotdir"
      - "../collector/outputs/:/app/collector/outputs"
    user: "${UID}:${GID}"
    ports:
      - "3001:3001"
    env_file:
      - .env
    networks:
      - anything-llm
    extra_hosts:
      - "host.docker.internal:host-gateway"
  # LiteLLM proxy, exposing an OpenAI-compatible API on port 8000
  litellm:
    image: ghcr.io/berriai/litellm:latest
    ports:
      - '8000:8000'
    environment:
      - PORT=8000
      - OPENAI_API_KEY=your-api-key
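If something like the above works, the existing Node backend would not need to move to Python; it could just call the proxy's OpenAI-compatible /embeddings route over HTTP. A rough sketch, where the URL follows the port mapping above and the model name is a placeholder:

```ts
// Rough sketch: calling the LiteLLM proxy's OpenAI-compatible
// /embeddings route from the Node backend (Node 18+ for global fetch).
// The URL matches the compose port mapping above; the model name is
// a placeholder for whatever LiteLLM is configured to route.
async function embedViaLiteLLM(chunks: string[]): Promise<number[][]> {
  const res = await fetch("http://localhost:8000/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: chunks }),
  });
  if (!res.ok) throw new Error(`LiteLLM proxy returned ${res.status}`);
  const data = await res.json();
  // OpenAI-style responses: { data: [{ embedding: number[] }, ...] }
  return data.data.map((d: { embedding: number[] }) => d.embedding);
}
```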
Hi,
I saw that local LLMs are on the roadmap. It won't make much sense if you don't also use local sentence transformers (see the MTEB leaderboard), like Instructor or E5.
I would suggest having this looked into before local LLMs, because you could potentially use an API endpoint for local LLMs.
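For illustration, a rough sketch of that last idea: if the local model is served behind an OpenAI-compatible endpoint (as servers such as LocalAI or llama.cpp's HTTP server expose), the backend can talk to it exactly as it talks to OpenAI. The URL and model name below are placeholder assumptions:

```ts
// Rough sketch: chatting with a locally served LLM through an
// OpenAI-compatible endpoint. The URL and model name are placeholders
// for whatever the local server is actually configured with.
async function localChat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local-model",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`Local LLM server returned ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```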