boyanxu opened this issue 3 days ago
To be honest, I'm not sure what's happening here. It could be something related to using async clients. In any case, we are planning to switch from TOOL mode to JSON schema mode, which should hopefully fix the issue (see here for an example: https://github.com/instructor-ai/instructor/issues/840)
Thanks for the kind reply. Anyway, impressive work. I'll keep an eye on the updates.
Probably a bug with this line. (The OllamaAILLMService was recently removed, though.)
Maybe setting the mode to instructor.Mode.JSON might fix it:
self.llm_async_client = instructor.from_openai(
    ollama_client,
    mode=instructor.Mode.JSON,
)
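For reference, a minimal sketch of how the `ollama_client` above could be constructed, assuming Ollama's OpenAI-compatible endpoint at `/v1` (the dummy API key is required by the client but ignored by Ollama):

```python
import instructor
from openai import AsyncOpenAI

# Assumed setup: a local Ollama server exposing an OpenAI-compatible API.
ollama_client = AsyncOpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
llm_async_client = instructor.from_openai(ollama_client, mode=instructor.Mode.JSON)
```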
Yes, I removed that so that we don't have to support two things (the OpenAI interface is the same, just a different base_url). Indeed, we will try the JSON schema approach; we just need to transform all the answers into JSON (a couple are plain strings as of now). I should be able to do that tomorrow or the day after :)
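For illustration, one way that transformation could look, as a hedged sketch (the `PlainAnswer` name is hypothetical, not fast-graphrag's actual schema): answers that are plain strings today get wrapped in a one-field Pydantic model so they can be requested and validated as JSON.

```python
from pydantic import BaseModel

class PlainAnswer(BaseModel):
    # Hypothetical wrapper: a plain-string answer becomes a single JSON
    # field, which is what JSON schema mode needs to validate against.
    answer: str
```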
The custom LLM implementation runs without errors, but it fails to produce any output. During the extraction process, the following validation errors occur:
Error during information extraction from document: 3 validation errors for Model
entities
Field required [type=missing, input_value={'Model': {'entities': [{...y7 and OtherEntity8'}]}}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
relationships
Field required [type=missing, input_value={'Model': {'entities': [{...y7 and OtherEntity8'}]}}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
other_relationships
Field required [type=missing, input_value={'Model': {'entities': [{...y7 and OtherEntity8'}]}}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/missing
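All three errors point at the same root cause: the model returned its fields nested under a top-level "Model" key instead of at the root of the JSON object, so `entities`, `relationships`, and `other_relationships` are reported as missing. A minimal workaround sketch (the field types are simplified assumptions, not fast-graphrag's actual schema):

```python
from pydantic import BaseModel, model_validator

class Model(BaseModel):
    entities: list[dict]
    relationships: list[dict]
    other_relationships: list[dict]

    # Workaround sketch: if the LLM wrapped everything under a "Model" key,
    # unwrap it before field validation runs.
    @model_validator(mode="before")
    @classmethod
    def _unwrap(cls, data):
        if isinstance(data, dict) and set(data) == {"Model"}:
            return data["Model"]
        return data
```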
"""Example usage of GraphRAG with custom LLM and Embedding services compatible with the OpenAI API."""
from typing import List
from dotenv import load_dotenv
from fast_graphrag import GraphRAG
from fast_graphrag._llm import OpenAIEmbeddingService, OpenAILLMService
load_dotenv()
DOMAIN = "Analyze this topic and identify the core concepts. Focus on how they interact with each other, the topics they explore, and their relationships."
QUERIES: List[str] = [
"What is transformer?",
"How transformers work?"
]
ENTITY_TYPES: List[str] = ["Topic", "Concept"]
working_dir = "./examples/transformer"
grag = GraphRAG(
working_dir=working_dir,
domain=DOMAIN,
example_queries="\n".join(QUERIES),
entity_types=ENTITY_TYPES,
config=GraphRAG.Config(
llm_service=OpenAILLMService(model="mistral", base_url="http://localhost:11434/v1", api_key="your-api-key"),
embedding_service=OpenAIEmbeddingService(
model="nomic-ai/nomic-embed-text-v1.5-GGUF",
base_url="http://localhost:8001/v1",
api_key="your-api-key",
embedding_dim=768, # the output embedding dim of the chosen model
),
),
)
with open("./examples/transformer/Transformers_intro.txt") as f:
grag.insert(f.read())
print(grag.query("Who is transformer?").response)
def __post_init__(self):
    logger.debug("Initialized OpenAILLMService with patched OpenAI client.")
    self.llm_async_client: instructor.AsyncInstructor = instructor.from_openai(
        AsyncOpenAI(base_url=self.base_url, api_key=self.api_key, timeout=TIMEOUT_SECONDS),
        mode=instructor.Mode.JSON,
    )
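For context, a hedged sketch of how such a patched client is then typically used (assumed usage, not the actual fast-graphrag call site; `Model` and `prompt` are placeholders):

```python
# instructor validates the raw JSON response against the Pydantic
# response_model and retries the request when validation fails.
answer = await self.llm_async_client.chat.completions.create(
    model="mistral",
    response_model=Model,
    max_retries=2,
    messages=[{"role": "user", "content": prompt}],
)
```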
Ref: https://python.useinstructor.com/examples/ollama/?h=ollama#ollama
The OpenAI Python client doesn't work properly with the embedding API of Ollama. I created a small utility to work around this limitation. You can find the utility here.
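For illustration, a minimal sketch of such a workaround (an assumption, not the linked utility): bypass the OpenAI client and call Ollama's native `/api/embeddings` endpoint directly.

```python
import httpx

async def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    # Post to Ollama's native embeddings endpoint instead of going through
    # the OpenAI-compatible /v1 route.
    async with httpx.AsyncClient() as client:
        r = await client.post(
            "http://localhost:11434/api/embeddings",
            json={"model": model, "prompt": text},
        )
        r.raise_for_status()
        return r.json()["embedding"]
```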
Any guidance on how to handle the LLM model input to resolve the issue would be greatly appreciated.
Describe the bug
The Instructor library throws an error when attempting to process multiple tool calls. The specific error is "Instructor does not support multiple tool calls, use List[Model] instead".
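The error message itself points at the fix: request a `List[Model]` so instructor can map each tool call onto a list element. A hedged sketch (the `Entity` schema, model name, and prompt are placeholders):

```python
import asyncio
from typing import List

import instructor
from openai import AsyncOpenAI
from pydantic import BaseModel

class Entity(BaseModel):  # hypothetical schema
    name: str

async def main() -> None:
    client = instructor.from_openai(AsyncOpenAI())
    # Asking for List[Entity] lets instructor collect one Entity per tool
    # call instead of raising on multiple calls.
    entities = await client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=List[Entity],
        messages=[{"role": "user", "content": "Extract the entities."}],
    )
    print(entities)

asyncio.run(main())
```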
To Reproduce
Error