I seem to have the same issue with Gemini: it looks like the pydantic validation wants there to be a 'name' field in both LlamaEmbeddingModel and GoogleGenerativeAIEmbeddings(). Full error below.
My Docs():
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
This is what data looks like before the error:
{'llm_model': ChatGoogleGenerativeAI(name='gemini-pro', model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={}
)), 'embedding': GoogleGenerativeAIEmbeddings(model='models/embedding-001', task_type=None, google_api_key=None,
client_options=None, transport=None)}
And the error:
Traceback (most recent call last):
File ".//python3.10/site-packages/paperqa/docs.py", line 129, in __init__
super().__init__(**data)
File ".//python3.10/site-packages/pydantic/main.py", line 171, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
name
Input should be a valid string [type=string_type, input_value=GoogleGenerativeAIEmbeddi...ns=None, transport=None), input_type=GoogleGenerativeAIEmbeddings]
So neither GoogleGenerativeAIEmbeddings nor (I assume) LlamaEmbeddingModel has a name field; ChatGoogleGenerativeAI has one ('gemini-pro'). The pydantic rules in docs.py say that 'name' should be a string, but in our case it's not a string, it's the entire model. I don't know enough about pydantic to fix this right now, and I'm not sure how to add that field.
Edit: OK here's the issue: https://github.com/whitead/paper-qa/blob/350225c06ec0fd85d321ee63b744194f3d1259cb/paperqa/docs.py#L159
data['embedding'] is assumed to be a string, not a model object.
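A minimal sketch (hypothetical class names, not paper-qa's actual code) of what trips the validator: a pydantic field declared as str rejects anything that isn't a string, including an entire embeddings model:
from pydantic import BaseModel, ValidationError

class EmbeddingSpec(BaseModel):
    name: str  # declared as a plain string

EmbeddingSpec(name="sentence_transformers")  # OK

try:
    EmbeddingSpec(name=object())  # stand-in for an embeddings model instance
except ValidationError as e:
    print(e)  # Input should be a valid string [type=string_type, ...]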
This code kind of works:
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding='sentence_transformers',
)
This will use the local SentenceTransformerEmbeddingModel() instead of Gemini. Then I run into the next error, as it assumes that llm_model is wrapped within LLMModel; that's for later today.
Sorry about that - yes, embedding is for a string. Use this syntax instead for passing a model (I updated the README - thanks):
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
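For completeness, a typical follow-on usage (the file name here is hypothetical; the query is the README's example), assuming the standard Docs.add/Docs.query API:
docs.add("my_paper.pdf")  # hypothetical local PDF
answer = docs.query("What manufacturing challenges are unique to bispecific antibodies?")
print(answer.formatted_answer)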
For @philippbayer - your example would be using Langchain (I assume!):
docs = Docs(
    llm="langchain",
    client=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding="langchain",
    embedding_client=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
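For completeness, the imports this snippet assumes: ChatGoogleGenerativeAI and GoogleGenerativeAIEmbeddings come from the langchain-google-genai package (which reads a GOOGLE_API_KEY environment variable):
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from paperqa import Docs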
This code (also in the README):
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
still generates this error:
pydantic_core._pydantic_core.ValidationError: 1 validation error for LlamaEmbeddingModel
name
I think the problem in both cases is that LlamaEmbeddingModel() requires a name argument - it's that instantiation that's complaining, not Docs().
Also, embedding_model isn't a valid argument for Docs:
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
embedding_model
Extra inputs are not permitted [type=extra_forbidden, input_value=LlamaEmbeddingModel(name=...h_size=4, concurrency=1), input_type=LlamaEmbeddingModel]
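A minimal sketch (a hypothetical class, not paper-qa's actual code) of why pydantic reports extra_forbidden: a model configured with extra="forbid" rejects any keyword argument it doesn't declare.
from pydantic import BaseModel, ConfigDict, ValidationError

class DocsLike(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown kwargs become errors
    name: str = "default"

try:
    DocsLike(embedding_model=object())  # not a declared field
except ValidationError as e:
    print(e)  # Extra inputs are not permitted [type=extra_forbidden, ...]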
Perhaps embedding_client instead?
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_client=LlamaEmbeddingModel(name='llama'),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
That fixed the pydantic issue, but revealed another one.
The OpenAI API gets embeddings via a POST to the '/embeddings' endpoint (https://platform.openai.com/docs/api-reference/embeddings), but the locally hosted llamafile server serves embeddings from '/embedding' (https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/README.md#api-endpoints).
This results in a File Not Found API error when trying to add documents and embed text on a locally hosted LLM.
Llamafile is currently on version 0.6.2, which was last synchronized with llama.cpp on 2024-01-27. However, the llama.cpp commit that fixes this problem didn't land until 2024-01-29 (https://github.com/ggerganov/llama.cpp/pull/5190/commits/94613299b4d52ab0f79b3eaa6c870c883d470863). So until the next llamafile release syncs to a llama.cpp version after that date, paper-qa is not compatible with llamafile, I believe.
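A quick probe (a sketch: assumes a llamafile 0.6.2 server on localhost:8080 and uses llama.cpp's native request body) that makes the mismatch visible:
import requests

# llama.cpp's native embedding route, which llamafile exposes
r1 = requests.post("http://localhost:8080/embedding", json={"content": "hello"})
print(r1.status_code)  # 200

# the OpenAI-compatible route the AsyncOpenAI client posts to
r2 = requests.post("http://localhost:8080/v1/embeddings",
                   json={"input": "hello", "model": "local"})
print(r2.status_code)  # 404 on builds predating the llama.cpp fix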
Hi @alexanderchang1 - maybe try using SentenceTransformer?
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
Or I added a new embedding that is just keyword-based:
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sparse",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
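Roughly, the idea behind a keyword-based "sparse" embedding (a generic sketch of the technique, not paper-qa's actual implementation) is to hash tokens into a fixed-size count vector, so texts that share keywords end up with similar vectors:
from collections import Counter

def sparse_embed(text: str, dim: int = 256) -> list[float]:
    # hash each token into one of `dim` buckets and count occurrences
    counts = Counter(hash(tok) % dim for tok in text.lower().split())
    vec = [0.0] * dim
    for bucket, count in counts.items():
        vec[bucket] = float(count)
    return vec

print(sparse_embed("keyword based sparse embedding")[:8])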
Hi @whitead,
Still getting the File Not Found error due to the /embeddings vs. /embedding conflict in llamafile.
Hi @whitead, I got it to work with the following method.
llamafile servers are currently incompatible and result in a File Not Found error due to the endpoint differences.
Instead, a user has to install the latest llama-cpp-python bindings with the web server (https://github.com/abetlen/llama-cpp-python#web-server), and then run the server locally:
python -m llama_cpp.server --model ./models/llama-2-7b.Q5_K_M.gguf --n_gpu_layers 35 --port 8010
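As a sanity check (a sketch; assumes the server above is running on port 8010), llama-cpp-python's web server exposes the OpenAI-compatible /v1/embeddings route that llamafile 0.6.2 lacks:
import requests

r = requests.post(
    "http://localhost:8010/v1/embeddings",
    json={"input": "hello", "model": "llama-2-7b"},
)
print(r.status_code)  # expect 200 rather than llamafile's 404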
Then you can run it locally via
from paperqa import Docs, OpenAILLMModel, print_callback
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8010/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
And the remainder of the code is the same; however, the performance drops a lot using sentence transformers. Is there an updated example of how to use LlamaEmbeddingModel instead? Or any other model - my hope is to eventually use Mixtral 8x7B.
Hello everyone, we have just released version 5, which completely outsources all LLM management to BerriAI/litellm.
If your issue persists, please open a new issue using paper-qa>=5.