v2rockets / Loyal-Elephie

Your Trusty Memory-enabled AI Companion - Multilingual RAG chatbot optimized for local LLMs | OpenAI API Compatible
MIT License

Errors with Embedding Server, Possibly Related to Open AI Embedding Setup #4

Closed rooben-me closed 2 weeks ago

rooben-me commented 1 month ago

First of all, thank you for your incredible work.

I followed your setup instructions, but the app only produces output some of the time, and the backend terminal shows these errors.

Screenshot 2024-06-03 at 2 14 01 AM

I think the issue is with the embedding server. I haven't set up OpenAI embeddings; do I need to set up an OpenAI API for embeddings? The screenshot shows an error with the OpenAI embedding call.

Screenshot 2024-06-03 at 2 14 12 AM

v2rockets commented 1 month ago

Yes, it seems the embedding model is not set up correctly. You can either run a local embedding server or use an existing OpenAI embedding model. The relevant configuration is in settings.py.

Gerkinfeltser commented 1 month ago

Can you please recommend any how-tos for installing a local embedding server? I'm having similar issues and would prefer not to use OpenAI.

win10ogod commented 1 month ago

I have the same problem.

v2rockets commented 1 month ago

OK, a sentence-transformers example was added in this commit: https://github.com/v2rockets/Loyal-Elephie/commit/e6a46ad9987b7d549029126c54b22aa034ac50e6. Just choose a suitable model.
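
For readers who want a feel for the sentence-transformers route before opening the commit, here is a minimal sketch. It is not the code from that commit. The helper names `load_encoder` and `embed_texts` are hypothetical, and the default model name is only a placeholder borrowed from the script posted later in this thread; any model supported by sentence-transformers should work the same way.

```python
from typing import List


def load_encoder(model_name: str = "BAAI/bge-m3"):
    """Load a sentence-transformers model (weights are downloaded on first use).

    Requires: pip install sentence-transformers
    The import is done lazily so the rest of this file works without the package.
    """
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name)


def embed_texts(texts: List[str], encoder) -> List[List[float]]:
    """Return one L2-normalized embedding (a list of floats) per input text.

    `encoder` is any object with a sentence-transformers style
    `encode(texts, normalize_embeddings=...)` method.
    """
    vectors = encoder.encode(texts, normalize_embeddings=True)
    return [[float(x) for x in vec] for vec in vectors]
```

Typical use would be `embed_texts(["hello", "world"], load_encoder())`. Passing the encoder in explicitly keeps the model loaded once and reused across calls.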

rooben-me commented 1 month ago

@v2rockets Thank you very much

win10ogod commented 1 month ago

This is a Windows error. Try using this script.

from flask import Flask, request, jsonify
from transformers import AutoTokenizer, AutoModel
import torch
from datetime import datetime

# Set API key
API_KEY = "ollama"

class HFEmbeddingModel:
    def __init__(self, model_name: str):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)

    def get_embeddings(self, texts: list):
        inputs = self.tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
        with torch.no_grad():
            outputs = self.model(**inputs)
        # CLS pooling: use the first token's hidden state as the sentence embedding
        embeddings = outputs.last_hidden_state[:, 0, :].cpu().numpy()
        return embeddings.tolist()

def date_to_timestamp(date_str: str) -> int:
    """Convert a 'YYYY-MM-DD' date string to a Unix timestamp in milliseconds.

    The date is interpreted as midnight in the machine's local time zone.
    """
    try:
        dt = datetime.strptime(date_str, '%Y-%m-%d')
        return int(dt.timestamp() * 1000)
    except ValueError as e:
        print(f"Date conversion error: {e}")
        raise

class ChromaDocManager:
    def __init__(self, collection):
        self.collection = collection

    def query_by_strings_with_time_range(self, query_strings, n_results, start_time, end_time):
        """Query by string and filter by time range"""
        start_timestamp = date_to_timestamp(start_time)
        end_timestamp = date_to_timestamp(end_time)

        print(f"query string: {query_strings}")
        print(f"Start timestamp: {start_timestamp}, End timestamp: {end_timestamp}")

        try:
            res = self.collection.query(
                query_texts=query_strings,
                n_results=n_results,
                where={"timestamp": {"$gte": start_timestamp, "$lte": end_timestamp}}
            )
            return res
        except ValueError as e:
            print(f"Query error: {e}")
            raise

# Initialize embedding model
hf_model = HFEmbeddingModel("BAAI/bge-m3")
app = Flask(__name__)

def check_api_key():
    """Check the API key in the request header"""
    auth_header = request.headers.get('Authorization')
    if auth_header and auth_header.startswith("Bearer "):
        token = auth_header.split(" ")[1]
        return token == API_KEY
    return False

@app.route('/v1/embeddings', methods=['POST'])
def get_embeddings():
    if not check_api_key():
        return jsonify({"error": "Unauthorized"}), 401

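    # silent=True returns None instead of raising on a missing or malformed JSON body
    data = request.get_json(silent=True) or {}
    model = data.get('model', 'text-embedding-ada-002')  # Default model name
    inputs = data.get('input', [])

    # The OpenAI embeddings API accepts either a single string or a list of strings
    if isinstance(inputs, str):
        inputs = [inputs]
    if not inputs or not isinstance(inputs, list):
        return jsonify({"error": "Invalid input"}), 400

    # Generate embeddings using the Hugging Face model
    embeddings = hf_model.get_embeddings(inputs)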

    return jsonify({
        "object": "list",
        "data": [{"object": "embedding", "index": i, "embedding": emb} for i, emb in enumerate(embeddings)],
        "model": model,
        # Rough estimate: whitespace-separated word counts, not real tokenizer counts
        "usage": {
            "prompt_tokens": sum(len(text.split()) for text in inputs),
            "total_tokens": sum(len(text.split()) for text in inputs)
        }
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=7440)
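
To check that the script above is serving embeddings, a small client along these lines may help. It is only a sketch using the Python standard library. The function names `build_embedding_request` and `fetch_embeddings` are hypothetical, and the defaults assume the server is running locally on port 7440 with the API key "ollama", as in the script.

```python
import json
import urllib.request


def build_embedding_request(texts, base_url="http://localhost:7440",
                            api_key="ollama", model="BAAI/bge-m3"):
    """Build (but do not send) a POST request for the /v1/embeddings endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # The server compares this bearer token against its API_KEY
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def fetch_embeddings(texts, **kwargs):
    """Send the request and return one embedding vector per input text."""
    req = build_embedding_request(texts, **kwargs)
    with urllib.request.urlopen(req, timeout=30) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

With the server running, `fetch_embeddings(["hello world"])` should return a list containing a single vector. A 401 response means the bearer token does not match the server's API_KEY.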