Open ToyHugs opened 4 months ago
Same issue with a simple structured output.
Like in the LangGraph tutorial, I am trying to use a List in with_structured_output with ChatGoogleGenerativeAI.
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.pydantic_v1 import BaseModel, Field
from typing import List
class Plan(BaseModel):
    steps: List[str] = Field(
        description="different steps to follow, should be in sorted order"
    )
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2, verbose=True).with_structured_output(Plan)
print(model.invoke("what is the hometown of the current Australia open winner?"))
And the error:
raise ChatGoogleGenerativeAIError(
langchain_google_genai.chat_models.ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 * GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[setup].items: missing field.
Without the List, it works like a charm.
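The 400 error above complains that an array property in the function declaration has no `items` field. As a quick way to check a schema before sending it, here is a minimal sketch, assuming the declaration's parameters are available as a JSON-Schema-style dict. `find_arrays_missing_items` is a hypothetical helper, not part of LangChain or the Gemini SDK.

```python
from typing import Any, Dict, List


def find_arrays_missing_items(schema: Dict[str, Any], path: str = "parameters") -> List[str]:
    """Return the paths of array-typed nodes that lack an `items` sub-schema."""
    problems: List[str] = []
    if schema.get("type") == "array":
        items = schema.get("items")
        if not isinstance(items, dict) or not items:
            # This is the situation the 400 error describes.
            problems.append(path)
        else:
            problems.extend(find_arrays_missing_items(items, f"{path}.items"))
    for name, sub in schema.get("properties", {}).items():
        if isinstance(sub, dict):
            problems.extend(find_arrays_missing_items(sub, f"{path}.properties[{name}]"))
    return problems


# Hand-written example schemas (assumptions, not captured from a real request).
good = {"type": "object", "properties": {"steps": {"type": "array", "items": {"type": "string"}}}}
bad = {"type": "object", "properties": {"steps": {"type": "array"}}}

print(find_arrays_missing_items(good))  # []
print(find_arrays_missing_items(bad))   # ['parameters.properties[steps]']
```

If the list is non-empty for the schema that actually reaches Gemini, the `items` definition was probably lost somewhere in the schema conversion.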
I am also having the same issue. Just like @Mikatux, I am using ChatGoogleGenerativeAI with the tutorial code and get the same error. Without List, it works as well.
Same issue. However, in my case this works unreliably when my schema inherits from BaseModel and does not work at all whenever I try to pass a TypedDict-based output model. The documentation says that it must work...
Same here. If I have a list parameter in my tool input model I receive the same error.
Same here. Is there a workaround for this?
Yes, same for me.
I had a similar issue. After playing around for a while I found the following solutions.
Alternative 1: Naive approach. Update your prompt with an additional line like "the output should be a JSON object with 'entities' as key with the list of names populated as list. No Preamble".
Alternative 2:
from langchain_core.utils.function_calling import convert_to_openai_function
dict_schema = convert_to_openai_function(Entities)
entity_chain = prompt | llm.with_structured_output(dict_schema)
Alternative 2 worked for me. Reference. Note: My LLM is ChatGoogleGenerativeAI not ChatVertexAI, but it still works!
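For Alternative 1, the reply comes back as plain text, so the JSON still has to be pulled out of it. A minimal sketch using only the standard library, assuming the model may wrap the object in some extra text despite the "No Preamble" instruction. `extract_json_object` is a hypothetical helper name and the sample reply is made up.

```python
import json
from typing import Any, Dict


def extract_json_object(text: str) -> Dict[str, Any]:
    """Parse the first decodable JSON object found in a model reply."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in reply")


# Made-up reply text for illustration.
reply = 'Here you go: {"entities": ["Alice", "Bob"]} Hope that helps.'
print(extract_json_object(reply)["entities"])  # ['Alice', 'Bob']
```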
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.pydantic_v1 import BaseModel, Field
class Plan(BaseModel):
    steps: str = Field(
        description="different steps to follow, should be in sorted order"
    )
model = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2, verbose=True).with_structured_output(Plan)
print(model.invoke("what is the hometown of the current Australia open winner?"))
This gets me None. Using convert_to_openai_function gets me [] instead. with_structured_output seems broken for ChatGoogleGenerativeAI?
Holy damn, I just realized that this happens because the model simply doesn't want to answer. See this code:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.pydantic_v1 import BaseModel, Field
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.2, verbose=True)
class Plan(BaseModel):
    '''Plan to be awesome'''
    steps: str = Field(description="different steps to follow to be awesome")
model = llm.with_structured_output(Plan)
print(model.invoke("what is the hometown of the current Australia open winner?")) # <--- This returns None !!!!!!!!
print(model.invoke("How should I be awesome?")) # <--- This returns something
Is there a setting to force the model to answer? lazy model?
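I don't know of a setting that is confirmed in this thread, but one way to cope with the occasional None is to retry with an extra instruction. A minimal sketch, assuming `invoke` is any callable that takes a prompt and returns a parsed object or None (for example `model.invoke` from the snippet above). `invoke_until_parsed`, `NUDGE`, and `fake_invoke` are hypothetical names, and the nudge wording is only a guess at what might help.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Hypothetical extra instruction appended on retries; tune the wording for your use case.
NUDGE = "\nAlways fill in the structured output, even if you are unsure of the answer."


def invoke_until_parsed(
    invoke: Callable[[str], Optional[T]], prompt: str, attempts: int = 3
) -> Optional[T]:
    """Call `invoke` until it returns something other than None, at most `attempts` times."""
    for attempt in range(attempts):
        # First try uses the prompt unchanged; retries append the nudge.
        text = prompt if attempt == 0 else prompt + NUDGE
        result = invoke(text)
        if result is not None:
            return result
    return None


def fake_invoke(prompt: str) -> Optional[dict]:
    # Stand-in for model.invoke: only "answers" once the nudge is present.
    return {"steps": "stub answer"} if NUDGE in prompt else None


print(invoke_until_parsed(fake_invoke, "what is the hometown of the current Australia open winner?"))
```

With the real model you would pass `model.invoke` instead of `fake_invoke`; whether the nudge actually changes the model's behaviour is something to verify yourself.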
My two cents:
Libraries like LangChain, LiteLLM, etc. are not fast enough to keep up with Google/OpenAI SDK and API updates. I migrated this part of my code to the vanilla Google Gemini API and had much better results.
That said, I think it is a hard problem to solve (unifying the APIs in such a fast-changing API landscape).
Has anyone fixed it somehow?
I switched to Vertex AI which works for me, but it's not ideal given different quotas / rate limits between the two services.
from langchain_google_vertexai import ChatVertexAI
from pydantic import BaseModel, Field
class VertexOutputModel(BaseModel):
    age: int = Field(..., description="Age of user")
vertex_llm = ChatVertexAI(
    model="gemini-1.5-flash",
    temperature=0,
    max_output_tokens=2048,
    stream=False,
).with_structured_output(VertexOutputModel)
I get another error with a similar example:
Exception has occurred: TypeError
list indices must be integers or slices, not str
File "/path/test.py", line XXX, in test
.with_structured_output(schema=TestModel2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<string>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
Test example pseudocode:
class TestModel1(BaseModel):
    test: str = Field(description="Test")
class TestModel2(BaseModel):
    test: list[TestModel1]
Any updates here?
Any updates here?
@baskaryan any luck?
from enum import Enum
from pydantic import BaseModel, Field
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
from langchain_core.messages import HumanMessage
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
class TestModel1(BaseModel):
    """Test attribute definition"""
    test: str = Field(description="Test")
class TestModel2(BaseModel):
    """Extracts a list of test attributes"""
    test: list[TestModel1]
model_anthropic = ChatAnthropic(model='claude-3-5-sonnet-latest', temperature=0) \
    .with_structured_output(schema=TestModel2.model_json_schema())
model_google = ChatGoogleGenerativeAI(model='models/gemini-1.5-pro-latest', temperature=0) \
    .with_structured_output(schema=TestModel2.model_json_schema())
query = """Test1, Test2, Test3"""
messages = [HumanMessage(query)]
response_anthropic = model_anthropic.invoke(input=messages)
response_google = model_google.invoke(input=messages)
print(response_anthropic)
print(response_google)
{'test': [{'test': 'Test1'}, {'test': 'Test2'}, {'test': 'Test3'}]}
[{'args': {}, 'type': 'TestModel2'}]
model_google = ChatGoogleGenerativeAI(model='models/gemini-1.5-pro-latest', temperature=0) \
    .with_structured_output(schema=TestModel2, include_raw=True)
Value 'Test attribute definition' is not supported in schema, ignoring v=Test attribute definition
Value '['test']' is not supported in schema, ignoring v=['test']
Value 'TestModel1' is not supported in schema, ignoring v=TestModel1
Value 'object' is not supported in schema, ignoring v=object
Value 'Test attribute definition' is not supported in schema, ignoring v=Test attribute definition
Value '['test']' is not supported in schema, ignoring v=['test']
Value 'TestModel1' is not supported in schema, ignoring v=TestModel1
Value 'object' is not supported in schema, ignoring v=object
{'name': 'TestModel2', 'description': 'Extracts a list of test attributes', 'parameters': {'type_': 6, 'properties': {'test': {'type_': 5, 'items': {'type_': 1, 'format_': '', 'description': '', 'nullable': False, 'enum': [], 'max_items': '0', 'min_items': '0', 'properties': {}, 'required': []}, 'format_': '', 'description': '', 'nullable': False, 'enum': [], 'max_items': '0', 'min_items': '0', 'properties': {}, 'required': []}}, 'required': ['test'], 'format_': '', 'description': '', 'nullable': False, 'enum': [], 'max_items': '0', 'min_items': '0'}}
{'raw': AIMessage(content='', additional_kwargs={'function_call': {'name': 'TestModel2', 'arguments': '{"test": ["Test1", "Test2", "Test3"]}'}}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-a2b4c849-ee1b-4220-b16a-e0133fc4523e-0', tool_calls=[{'name': 'TestModel2', 'args': {'test': ['Test1', 'Test2', 'Test3']}, 'id': 'b6aa2cdb-087d-4560-b4d9-3849d6abb371', 'type': 'tool_call'}], usage_metadata={'input_tokens': 50, 'output_tokens': 24, 'total_tokens': 74}), 'parsing_error': 3 validation errors for TestModel2
test.0
Input should be a valid dictionary or instance of TestModel1 [type=model_type, input_value='Test1', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/model_type
test.1
Input should be a valid dictionary or instance of TestModel1 [type=model_type, input_value='Test2', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/model_type
test.2
Input should be a valid dictionary or instance of TestModel1 [type=model_type, input_value='Test3', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/model_type, 'parsed': None}
model_google = ChatGoogleGenerativeAI(model='models/gemini-1.5-pro-latest', temperature=0) \
    .with_structured_output(schema=TestModel2.model_json_schema(), include_raw=True)
{'$defs': {'TestModel1': {'description': 'Test attribute definition', 'properties': {'test': {'description': 'Test', 'title': 'Test', 'type': 'string'}}, 'required': ['test'], 'title': 'TestModel1', 'type': 'object'}}, 'description': 'Extracts a list of test attributes', 'properties': {'test': {'items': {'$ref': '#/$defs/TestModel1'}, 'title': 'Test', 'type': 'array'}}, 'required': ['test'], 'title': 'TestModel2', 'type': 'object', 'parameters': {}}
{'$defs': {'TestModel1': {'description': 'Test attribute definition', 'properties': {'test': {'description': 'Test', 'title': 'Test', 'type': 'string'}}, 'required': ['test'], 'title': 'TestModel1', 'type': 'object'}}, 'description': 'Extracts a list of test attributes', 'properties': {'test': {'items': {'$ref': '#/$defs/TestModel1'}, 'title': 'Test', 'type': 'array'}}, 'required': ['test'], 'title': 'TestModel2', 'type': 'object', 'parameters': {}}
{'name': 'TestModel2', 'description': 'Extracts a list of test attributes', 'parameters': {}}
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1730140541.769303 2729666 fork_posix.cc:77] Other threads are currently calling into gRPC, skipping fork() handlers
{'raw': AIMessage(content='', additional_kwargs={'function_call': {'name': 'TestModel2', 'arguments': '{}'}}, response_metadata={'prompt_feedback': {'block_reason': 0, 'safety_ratings': []}, 'finish_reason': 'STOP', 'safety_ratings': [{'category': 'HARM_CATEGORY_DANGEROUS_CONTENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HARASSMENT', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_HATE_SPEECH', 'probability': 'NEGLIGIBLE', 'blocked': False}, {'category': 'HARM_CATEGORY_SEXUALLY_EXPLICIT', 'probability': 'NEGLIGIBLE', 'blocked': False}]}, id='run-d841bf55-f7ec-453b-a958-a0ccbeffe9ae-0', tool_calls=[{'name': 'TestModel2', 'args': {}, 'id': 'db62faf6-299e-4331-9f16-059876ccc819', 'type': 'tool_call'}], usage_metadata={'input_tokens': 37, 'output_tokens': 10, 'total_tokens': 47}), 'parsed': [{'args': {}, 'type': 'TestModel2'}], 'parsing_error': None}
However, this code "kind-of" works:
model_google = ChatGoogleGenerativeAI(model='models/gemini-1.5-pro-latest', temperature=0) \
    .bind_tools([TestModel2], tool_choice='any')
...
print(response_google.additional_kwargs['function_call']['arguments'])
{"test": ["Test1", "Test2", "Test3"]}
Reference: https://github.com/langchain-ai/langchain-google/pull/469
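With the bind_tools route, the arguments arrive as a JSON string and, in the output shown above, as a list of bare strings rather than a list of TestModel1-shaped objects. A minimal sketch of decoding and reshaping them, standard library only. `tool_args_from_kwargs` and `wrap_bare_strings` are hypothetical helpers, and the `sample` dict is copied from the output printed above rather than fetched from a live response.

```python
import json
from typing import Any, Dict


def tool_args_from_kwargs(additional_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Decode the JSON string stored under function_call -> arguments."""
    call = additional_kwargs.get("function_call") or {}
    return json.loads(call.get("arguments") or "{}")


def wrap_bare_strings(args: Dict[str, Any], list_field: str = "test", item_field: str = "test") -> Dict[str, Any]:
    """Turn ["a", "b"] into [{"test": "a"}, {"test": "b"}] so it matches the nested model."""
    items = args.get(list_field, [])
    wrapped = [{item_field: x} if isinstance(x, str) else x for x in items]
    return {**args, list_field: wrapped}


# Same shape as response_google.additional_kwargs in the output above.
sample = {"function_call": {"name": "TestModel2", "arguments": '{"test": ["Test1", "Test2", "Test3"]}'}}

args = tool_args_from_kwargs(sample)
print(wrap_bare_strings(args))
# {'test': [{'test': 'Test1'}, {'test': 'Test2'}, {'test': 'Test3'}]}
```

The reshaped dict has the same structure as the Anthropic result earlier in this comment, so it should be possible to validate it against TestModel2 afterwards.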
There seems to be a workaround:
from langchain_core.utils.function_calling import convert_to_openai_function
model_google = ChatGoogleGenerativeAI(model='models/gemini-1.5-pro-latest', temperature=0) \
    .with_structured_output(schema=convert_to_openai_function(TestModel2))
Reference: https://github.com/langchain-ai/langchain-google/issues/299
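The raw output of model_json_schema() shown earlier points to TestModel1 through `$defs` / `$ref`, and the converted declaration ended up with empty parameters. If that indirection is the problem (an assumption, not something confirmed in this thread), inlining the references before passing a plain dict might be worth trying. A minimal stdlib sketch follows; `inline_refs` is a hypothetical helper, it only handles local `#/$defs/...` references, and it will recurse forever on self-referencing models.

```python
import copy
from typing import Any, Dict


def inline_refs(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `schema` with local `#/$defs/...` references inlined."""
    defs = schema.get("$defs", {})
    prefix = "#/$defs/"

    def resolve(node: Any) -> Any:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith(prefix):
                target = copy.deepcopy(defs[ref[len(prefix):]])
                # Keys written next to the $ref override the referenced definition.
                siblings = {k: v for k, v in node.items() if k != "$ref"}
                return resolve({**target, **siblings})
            # Drop the $defs table itself; everything it defined is inlined.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(v) for v in node]
        return node

    return resolve(schema)


# Schema copied from the model_json_schema() output above (without the empty 'parameters' key).
schema = {
    "$defs": {
        "TestModel1": {
            "description": "Test attribute definition",
            "properties": {"test": {"description": "Test", "title": "Test", "type": "string"}},
            "required": ["test"],
            "title": "TestModel1",
            "type": "object",
        }
    },
    "description": "Extracts a list of test attributes",
    "properties": {"test": {"items": {"$ref": "#/$defs/TestModel1"}, "title": "Test", "type": "array"}},
    "required": ["test"],
    "title": "TestModel2",
    "type": "object",
}

flat = inline_refs(schema)
print(flat["properties"]["test"]["items"]["type"])  # object
```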
Checked other resources
Example Code
Colab link: https://colab.research.google.com/drive/1BCat5tBZRcxUhjQ3vGJD3Zu1eiqYIAWz?usp=sharing
Code:
Error Message and Stack Trace (if applicable)
InvalidArgument Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/langchain_google_genai/chat_models.py in _chat_with_retry(**kwargs)
    177 try:
--> 178 return generation_method(**kwargs)
    179 # Do not retry for these errors.

25 frames
/usr/local/lib/python3.10/dist-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py in generate_content(self, request, model, contents, retry, timeout, metadata)
    826 # Send the request.
--> 827 response = rpc(
    828 request,

/usr/local/lib/python3.10/dist-packages/google/api_core/gapic_v1/method.py in __call__(self, timeout, retry, compression, *args, **kwargs)
    130
--> 131 return wrapped_func(*args, **kwargs)
    132

/usr/local/lib/python3.10/dist-packages/google/api_core/retry/retry_unary.py in retry_wrapped_func(*args, **kwargs)
    292 )
--> 293 return retry_target(
    294 target,

/usr/local/lib/python3.10/dist-packages/google/api_core/retry/retry_unary.py in retry_target(target, predicate, sleep_generator, timeout, on_error, exception_factory, **kwargs)
    152 # defer to shared logic for handling errors
--> 153 _retry_error_helper(
    154 exc,

/usr/local/lib/python3.10/dist-packages/google/api_core/retry/retry_base.py in _retry_error_helper(exc, deadline, next_sleep, error_list, predicate_fn, on_error_fn, exc_factory_fn, original_timeout)
    211 )
--> 212 raise final_exc from source_exc
    213 if on_error_fn is not None:

/usr/local/lib/python3.10/dist-packages/google/api_core/retry/retry_unary.py in retry_target(target, predicate, sleep_generator, timeout, on_error, exception_factory, **kwargs)
    143 try:
--> 144 result = target()
    145 if inspect.isawaitable(result):

/usr/local/lib/python3.10/dist-packages/google/api_core/timeout.py in func_with_timeout(*args, **kwargs)
    119
--> 120 return func(*args, **kwargs)
    121

/usr/local/lib/python3.10/dist-packages/google/api_core/grpc_helpers.py in error_remapped_callable(*args, **kwargs)
     80 except grpc.RpcError as exc:
---> 81 raise exceptions.from_grpc_error(exc) from exc
     82

InvalidArgument: 400 * GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[key_developments].items: missing field.
The above exception was the direct cause of the following exception:
ChatGoogleGenerativeAIError Traceback (most recent call last)
in <cell line: 1>()
----> 1 results = rag_extractor.invoke("Key developments associated with cars")

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   2794 input = step.invoke(input, config, **kwargs)
   2795 else:
-> 2796 input = step.invoke(input, config)
   2797 # finish the root run
   2798 except BaseException as e:

/usr/local/lib/python3.10/dist-packages/langchain_core/runnables/base.py in invoke(self, input, config, **kwargs)
   4976 **kwargs: Optional[Any],
   4977 ) -> Output:
-> 4978 return self.bound.invoke(
   4979 input,
   4980 self._merge_configs(config),

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in invoke(self, input, config, stop, **kwargs)
    263 return cast(
    264 ChatGeneration,
--> 265 self.generate_prompt(
    266 [self._convert_input(input)],
    267 stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate_prompt(self, prompts, stop, callbacks, **kwargs)
    696 ) -> LLMResult:
    697 prompt_messages = [p.to_messages() for p in prompts]
--> 698 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
    699
    700 async def agenerate_prompt(

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    553 if run_managers:
    554 run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 555 raise e
    556 flattened_outputs = [
    557 LLMResult(generations=[res.generations], llm_output=res.llm_output) # type: ignore[list-item]

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
    543 try:
    544 results.append(
--> 545 self._generate_with_cache(
    546 m,
    547 stop=stop,

/usr/local/lib/python3.10/dist-packages/langchain_core/language_models/chat_models.py in _generate_with_cache(self, messages, stop, run_manager, **kwargs)
    768 else:
    769 if inspect.signature(self._generate).parameters.get("run_manager"):
--> 770 result = self._generate(
    771 messages, stop=stop, run_manager=run_manager, **kwargs
    772 )

/usr/local/lib/python3.10/dist-packages/langchain_google_genai/chat_models.py in _generate(self, messages, stop, run_manager, tools, functions, safety_settings, tool_config, generation_config, **kwargs)
    765 generation_config=generation_config,
    766 )
--> 767 response: GenerateContentResponse = _chat_with_retry(
    768 request=request,
    769 **kwargs,

/usr/local/lib/python3.10/dist-packages/langchain_google_genai/chat_models.py in _chat_with_retry(generation_method, **kwargs)
    194 raise e
    195
--> 196 return _chat_with_retry(**kwargs)
    197
    198

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in wrapped_f(*args, **kw)
    334 copy = self.copy()
    335 wrapped_f.statistics = copy.statistics # type: ignore[attr-defined]
--> 336 return copy(f, *args, **kw)
    337
    338 def retry_with(*args: t.Any, **kwargs: t.Any) -> WrappedFn:

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
    473 retry_state = RetryCallState(retry_object=self, fn=fn, args=args, kwargs=kwargs)
    474 while True:
--> 475 do = self.iter(retry_state=retry_state)
    476 if isinstance(do, DoAttempt):
    477 try:

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in iter(self, retry_state)
    374 result = None
    375 for action in self.iter_state.actions:
--> 376 result = action(retry_state)
    377 return result
    378

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in <lambda>(rs)
    396 def _post_retry_check_actions(self, retry_state: "RetryCallState") -> None:
    397 if not (self.iter_state.is_explicit_retry or self.iter_state.retry_run_result):
--> 398 self._add_action_func(lambda rs: rs.outcome.result())
    399 return
    400

/usr/lib/python3.10/concurrent/futures/_base.py in result(self, timeout)
    449 raise CancelledError()
    450 elif self._state == FINISHED:
--> 451 return self.__get_result()
    452
    453 self._condition.wait(timeout)

/usr/lib/python3.10/concurrent/futures/_base.py in __get_result(self)
    401 if self._exception:
    402 try:
--> 403 raise self._exception
    404 finally:
    405 # Break a reference cycle with the exception in self._exception

/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py in __call__(self, fn, *args, **kwargs)
    476 if isinstance(do, DoAttempt):
    477 try:
--> 478 result = fn(*args, **kwargs)
    479 except BaseException: # noqa: B902
    480 retry_state.set_exception(sys.exc_info()) # type: ignore[arg-type]

/usr/local/lib/python3.10/dist-packages/langchain_google_genai/chat_models.py in _chat_with_retry(**kwargs)
    188
    189 except google.api_core.exceptions.InvalidArgument as e:
--> 190 raise ChatGoogleGenerativeAIError(
    191 f"Invalid argument provided to Gemini: {e}"
    192 ) from e

ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 * GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[key_developments].items: missing field.
Description
Hi!
Since yesterday, I have been trying to follow this official guide in the v0.2 documentation: https://python.langchain.com/v0.2/docs/how_to/extraction_long_text/
However, it doesn't work well with Chat Google Generative AI. The Colab link is here, if you want to try: https://colab.research.google.com/drive/1BCat5tBZRcxUhjQ3vGJD3Zu1eiqYIAWz?usp=sharing
I have followed the guide step by step, but I keep getting an error about a missing field in the request. For information, Chat Google Generative AI supports structured output: https://python.langchain.com/v0.2/docs/integrations/chat/google_generative_ai/ Also, it's not about my location either (I have already used Chat Google Generative AI successfully for other things).
I have tried different things with the schema, and I came to the conclusion that I can't use a schema that defines another schema inside it (or a List):
However, I can use this schema without problems:
(but responses with a schema tend to be very bad with Chat Google; about 90% of the time they are nonsense)
Sorry for my English, which is not perfect, and thank you for reading!
System Info
https://colab.research.google.com/drive/1BCat5tBZRcxUhjQ3vGJD3Zu1eiqYIAWz?usp=sharing