I seem to have the same issue with Gemini: it looks like the pydantic validation wants there to be a 'name' field in both LlamaEmbeddingModel and GoogleGenerativeAIEmbeddings(). Full error below.
My Docs():
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
This is what data looks like before the error:
{'llm_model': ChatGoogleGenerativeAI(name='gemini-pro', model='gemini-pro', client=genai.GenerativeModel(
    model_name='models/gemini-pro',
    generation_config={},
    safety_settings={}
)), 'embedding': GoogleGenerativeAIEmbeddings(model='models/embedding-001', task_type=None, google_api_key=None,
client_options=None, transport=None)}
And the error:
Traceback (most recent call last):
File ".//python3.10/site-packages/paperqa/docs.py", line 129, in __init__
super().__init__(**data)
File ".//python3.10/site-packages/pydantic/main.py", line 171, in __init__
self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
name
Input should be a valid string [type=string_type, input_value=GoogleGenerativeAIEmbeddi...ns=None, transport=None), input_type=GoogleGenerativeAIEmbeddings]
So neither GoogleGenerativeAIEmbeddings nor (I assume) LlamaEmbeddingModel has a name field; ChatGoogleGenerativeAI has one ('gemini-pro'). The pydantic rules in docs.py say that 'name' should be a string, but in our case it's not a string, it's the entire model. I don't know enough about pydantic to fix this right now, and I'm not sure how to add that field.
Edit: OK here's the issue: https://github.com/whitead/paper-qa/blob/350225c06ec0fd85d321ee63b744194f3d1259cb/paperqa/docs.py#L159
data['embedding'] is assumed to be a string, not a model object.
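A minimal sketch (hypothetical class names, not paper-qa's actual code) of what trips the validator: a pydantic field declared as str rejects anything that isn't a string, including an entire embeddings model:
from pydantic import BaseModel, ValidationError

class EmbeddingSpec(BaseModel):
    name: str  # declared as a plain string

EmbeddingSpec(name="sentence_transformers")  # OK

try:
    EmbeddingSpec(name=object())  # stand-in for an embeddings model instance
except ValidationError as e:
    print(e)  # Input should be a valid string [type=string_type, ...]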
This code kind of works:
docs = Docs(
    llm_model=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding='sentence_transformers',
)
This will use the local SentenceTransformerEmbeddingModel() instead of Gemini. Then I run into the next error, as it assumes that llm_model is wrapped within LLMModel; that's for later today.
Sorry about that - yes, embedding is for a string. Use this syntax instead for passing a model (I updated the README - thanks):
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
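For completeness, a typical follow-on usage (the file name here is hypothetical; the query is the README's example), assuming the standard Docs.add/Docs.query API:
docs.add("my_paper.pdf")  # hypothetical local PDF
answer = docs.query("What manufacturing challenges are unique to bispecific antibodies?")
print(answer.formatted_answer)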
For @philippbayer - your example would be using Langchain (I assume!):
docs = Docs(
    llm="langchain",
    client=ChatGoogleGenerativeAI(model='gemini-pro'),
    embedding="langchain",
    embedding_client=GoogleGenerativeAIEmbeddings(model="models/embedding-001"),
)
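For completeness, the imports this snippet assumes: ChatGoogleGenerativeAI and GoogleGenerativeAIEmbeddings come from the langchain-google-genai package (which reads a GOOGLE_API_KEY environment variable):
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from paperqa import Docs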
This code (also in the README):
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_model=LlamaEmbeddingModel(),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
still generates this error:
pydantic_core._pydantic_core.ValidationError: 1 validation error for LlamaEmbeddingModel
name
I think the problem in both cases is that LlamaEmbeddingModel() requires a name argument - it's that instantiation that's complaining, not Docs().
Also, embedding_model isn't a valid argument for Docs:
pydantic_core._pydantic_core.ValidationError: 1 validation error for Docs
embedding_model
Extra inputs are not permitted [type=extra_forbidden, input_value=LlamaEmbeddingModel(name=...h_size=4, concurrency=1), input_type=LlamaEmbeddingModel]
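A minimal sketch (a hypothetical class, not paper-qa's actual code) of why pydantic reports extra_forbidden: a model configured with extra="forbid" rejects any keyword argument it doesn't declare.
from pydantic import BaseModel, ConfigDict, ValidationError

class DocsLike(BaseModel):
    model_config = ConfigDict(extra="forbid")  # unknown kwargs become errors
    name: str = "default"

try:
    DocsLike(embedding_model=object())  # not a declared field
except ValidationError as e:
    print(e)  # Extra inputs are not permitted [type=extra_forbidden, ...]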
Perhaps embedding_client instead?
from paperqa import Docs, LlamaEmbeddingModel, OpenAILLMModel
from openai import AsyncOpenAI

# start llama.cpp client with
local_client = AsyncOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=local_client,
    embedding_client=LlamaEmbeddingModel(name='llama'),
    llm_model=OpenAILLMModel(
        config=dict(model="my-llm-model", temperature=0.1, frequency_penalty=1.5, max_tokens=512)
    ),
)
That fixed the pydantic issue, but revealed another one.
The OpenAI API gets embeddings via a POST to the '/embeddings' endpoint (https://platform.openai.com/docs/api-reference/embeddings), but the locally hosted llamafile server serves embeddings from '/embedding' (https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/README.md#api-endpoints).
This results in a File Not Found API error when trying to add documents and embed text on a locally hosted LLM.
Llamafile is currently on version 0.6.2, which was last synchronized with llama.cpp on 2024-01-27. However, the llama.cpp commit that fixes this problem didn't land until 2024-01-29 (https://github.com/ggerganov/llama.cpp/pull/5190/commits/94613299b4d52ab0f79b3eaa6c870c883d470863). So until the next llamafile release syncs to a llama.cpp version after that date, paper-qa is not compatible with llamafile, I believe.
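A quick probe (a sketch: assumes a llamafile 0.6.2 server on localhost:8080 and uses llama.cpp's native request body) that makes the mismatch visible:
import requests

# llama.cpp's native embedding route, which llamafile exposes
r1 = requests.post("http://localhost:8080/embedding", json={"content": "hello"})
print(r1.status_code)  # 200

# the OpenAI-compatible route the AsyncOpenAI client posts to
r2 = requests.post("http://localhost:8080/v1/embeddings",
                   json={"input": "hello", "model": "local"})
print(r2.status_code)  # 404 on builds predating the llama.cpp fix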
Hi @alexanderchang1 - maybe try using SentenceTransformer?
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
Or I added a new embedding that is just keyword-based:
from paperqa import Docs, OpenAILLMModel, print_callback

docs = Docs(
    client=client,
    embedding="sparse",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
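Roughly, the idea behind a keyword-based "sparse" embedding (a generic sketch of the technique, not paper-qa's actual implementation) is to hash tokens into a fixed-size count vector, so texts that share keywords end up with similar vectors:
from collections import Counter

def sparse_embed(text: str, dim: int = 256) -> list[float]:
    # hash each token into one of `dim` buckets and count occurrences
    counts = Counter(hash(tok) % dim for tok in text.lower().split())
    vec = [0.0] * dim
    for bucket, count in counts.items():
        vec[bucket] = float(count)
    return vec

print(sparse_embed("keyword based sparse embedding")[:8])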
Hi @whitead,
Still getting the File Not Found error due to the /embeddings vs. /embedding conflict in llamafile.
Hi @whitead, I got it to work with the following method.
llamafile servers are currently incompatible and result in a File Not Found error due to the endpoint differences.
Instead, a user has to install the latest llama-cpp-python bindings with the web server (https://github.com/abetlen/llama-cpp-python#web-server), and then run the server locally:
python -m llama_cpp.server --model ./models/llama-2-7b.Q5_K_M.gguf --n_gpu_layers 35 --port 8010
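As a sanity check (a sketch; assumes the server above is running on port 8010), llama-cpp-python's web server exposes the OpenAI-compatible /v1/embeddings route that llamafile 0.6.2 lacks:
import requests

r = requests.post(
    "http://localhost:8010/v1/embeddings",
    json={"input": "hello", "model": "llama-2-7b"},
)
print(r.status_code)  # expect 200 rather than llamafile's 404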
Then you can run it locally via
from paperqa import Docs, OpenAILLMModel, print_callback
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="http://localhost:8010/v1",
    api_key="sk-no-key-required",
)

docs = Docs(
    client=client,
    embedding="sentence-transformers",
    llm_result_callback=print_callback,
    llm_model=OpenAILLMModel(config=dict(model="my-llama.cpp-llm", temperature=0.1, max_tokens=512)),
)
And the remainder of the code is the same; however, the performance drops a lot using sentence transformers. Is there an updated example of how to use LlamaEmbeddingModel instead? Or any other model - my hope is to eventually use Mixtral 8x7B.
Hello everyone, we have just released version 5, which completely outsources all LLM management to BerriAI/litellm.
If your issue persists, please open a new issue using paper-qa>=5.