langchain-ai / langchain-google

Invalid request for structured output #343

Open tomasonjo opened 4 days ago

tomasonjo commented 4 days ago

I am trying to extract information using this code:

from typing import List, Optional

from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model='gemini-1.5-pro')

system_prompt = (
    "# Knowledge Graph Instructions for GPT-4\n"
    "## 1. Overview\n"
    "You are a top-tier algorithm designed for extracting information in structured "
    "formats to build a knowledge graph.\n"
    "Try to capture as much information from the text as possible without "
    "sacrificing accuracy. Do not add any information that is not explicitly "
    "mentioned in the text.\n"
    "- **Nodes** represent entities and concepts.\n"
    "- The aim is to achieve simplicity and clarity in the knowledge graph, making it\n"
    "accessible for a vast audience.\n"
    "## 2. Labeling Nodes\n"
    "- **Consistency**: Ensure you use available types for node labels.\n"
    "Ensure you use basic or elementary types for node labels.\n"
    "- For example, when you identify an entity representing a person, "
    "always label it as **'person'**. Avoid using more specific terms "
    "like 'mathematician' or 'scientist'.\n"
    "- **Node IDs**: Never utilize integers as node IDs. Node IDs should be "
    "names or human-readable identifiers found in the text.\n"
    "- **Relationships** represent connections between entities or concepts.\n"
    "Ensure consistency and generality in relationship types when constructing "
    "knowledge graphs. Instead of using specific and momentary types "
    "such as 'BECAME_PROFESSOR', use more general and timeless relationship types "
    "like 'PROFESSOR'. Make sure to use general and timeless relationship types!\n"
    "## 3. Coreference Resolution\n"
    "- **Maintain Entity Consistency**: When extracting entities, it's vital to "
    "ensure consistency.\n"
    'If an entity, such as "John Doe", is mentioned multiple times in the text '
    'but is referred to by different names or pronouns (e.g., "Joe", "he"), '
    "always use the most complete identifier for that entity throughout the "
    'knowledge graph. In this example, use "John Doe" as the entity ID.\n'
    "Remember, the knowledge graph should be coherent and easily understandable, "
    "so maintaining consistency in entity references is crucial.\n"
    "## 4. Strict Compliance\n"
    "Adhere to the rules strictly. Non-compliance will result in termination."
)

default_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            system_prompt,
        ),
        (
            "human",
            (
                "Tip: Make sure to answer in the correct format and do "
                "not include any explanations. "
                "Use the given format to extract information from the "
                "following input: {input}"
            ),
        ),
    ]
)

class Node(BaseModel):
    """Represents a node in a graph with associated properties."""

    id: str = Field(description="A unique identifier for the node.")
    type: str = Field(description="The type or label of the node.")


class Relationship(BaseModel):
    """Represents a directed relationship between two nodes in a graph."""

    source: Node = Field(description="The source node of the relationship.")
    target: Node = Field(description="The target node of the relationship.")
    type: str = Field(description="The type of the relationship.")


class Graph(BaseModel):
    """Represents a graph document consisting of nodes and relationships."""

    nodes: Optional[List[Node]] = Field(description="List of nodes")
    relationships: Optional[List[Relationship]] = Field(
        description="List of relationships"
    )

structured_llm = llm.with_structured_output(Graph)
chain = default_prompt | structured_llm

chain.invoke({"input": "What is going on?"})

But I get this error:

_InactiveRpcError                         Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     71         try:
---> 72             return callable_(*args, **kwargs)
     73         except grpc.RpcError as exc:

26 frames
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Request contains an invalid argument."
    debug_error_string = "UNKNOWN:Error received from peer ipv4:74.125.201.95:443 {created_time:"2024-07-01T10:36:39.099639512+00:00", grpc_status:3, grpc_message:"Request contains an invalid argument."}"
>

The above exception was the direct cause of the following exception:

InvalidArgument                           Traceback (most recent call last)
InvalidArgument: 400 Request contains an invalid argument.

The above exception was the direct cause of the following exception:

ChatGoogleGenerativeAIError               Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/langchain_google_genai/chat_models.py in _chat_with_retry(**kwargs)
    188 
    189         except google.api_core.exceptions.InvalidArgument as e:
--> 190             raise ChatGoogleGenerativeAIError(
    191                 f"Invalid argument provided to Gemini: {e}"
    192             ) from e

ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 Request contains an invalid argument.
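
One possible way to narrow this down is to look at the JSON schema that the nested models produce, since that schema is what ends up in the function declaration sent to Gemini. The sketch below is only a guess at the cause, not a confirmed diagnosis: it assumes the nested `Node` references (`$ref` / `definitions` entries) are what the API rejects. `GRAPH_SCHEMA` is a hand-written approximation of what pydantic v1 would generate for `Graph`, and `inline_refs` is a hypothetical helper written for this illustration; neither is part of LangChain or the Google SDK.

```python
import copy
import json

# Hand-written approximation (assumption) of the pydantic v1 schema for Graph.
GRAPH_SCHEMA = {
    "title": "Graph",
    "type": "object",
    "properties": {
        "nodes": {
            "title": "Nodes",
            "description": "List of nodes",
            "type": "array",
            "items": {"$ref": "#/definitions/Node"},
        },
        "relationships": {
            "title": "Relationships",
            "description": "List of relationships",
            "type": "array",
            "items": {"$ref": "#/definitions/Relationship"},
        },
    },
    "definitions": {
        "Node": {
            "title": "Node",
            "type": "object",
            "properties": {
                "id": {"title": "Id", "type": "string"},
                "type": {"title": "Type", "type": "string"},
            },
            "required": ["id", "type"],
        },
        "Relationship": {
            "title": "Relationship",
            "type": "object",
            "properties": {
                "source": {
                    "title": "Source",
                    "description": "The source node of the relationship.",
                    "allOf": [{"$ref": "#/definitions/Node"}],
                },
                "target": {
                    "title": "Target",
                    "description": "The target node of the relationship.",
                    "allOf": [{"$ref": "#/definitions/Node"}],
                },
                "type": {"title": "Type", "type": "string"},
            },
            "required": ["source", "target", "type"],
        },
    },
}


def inline_refs(schema):
    """Return a copy of `schema` with local '#/definitions/X' refs inlined.

    Hypothetical helper; it does not handle recursive models, which would
    make the resolution loop forever.
    """
    definitions = schema.get("definitions", {})

    def resolve(node):
        if isinstance(node, list):
            return [resolve(item) for item in node]
        if not isinstance(node, dict):
            return node
        if "$ref" in node:
            name = node["$ref"].split("/")[-1]
            target = resolve(copy.deepcopy(definitions[name]))
            siblings = {k: resolve(v) for k, v in node.items() if k != "$ref"}
            return {**target, **siblings}
        out = {k: resolve(v) for k, v in node.items()}
        # pydantic v1 wraps a described nested model in a one-element allOf;
        # merge it so the field keeps its own title and description.
        all_of = out.get("allOf")
        if isinstance(all_of, list) and len(all_of) == 1 and isinstance(all_of[0], dict):
            out.pop("allOf")
            out = {**all_of[0], **out}
        return out

    # Drop only the top-level definitions table, then resolve everything else.
    root = {k: v for k, v in schema.items() if k != "definitions"}
    return resolve(root)


if __name__ == "__main__":
    print(json.dumps(inline_refs(GRAPH_SCHEMA), indent=2))
```

Comparing the printed, reference-free schema against the original could show whether the nesting matters. If a flat variant of `Graph` (for example, relationships that carry plain string IDs instead of nested `Node` objects) goes through, that would point at schema conversion rather than at the prompt.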