langchain-ai / langchain-google

MIT License

[Breaking] change the behavior of ChatVertexAI.with_structured_output when specifying a dict #333

Open kiarina opened 3 days ago

kiarina commented 3 days ago

The behavior of ChatVertexAI.with_structured_output when given a dict differed from that of ChatOpenAI and ChatAnthropicVertex, so this PR changes it to match.

This is considered breaking as it changes existing functionality. If this change is unacceptable, please feel free to close it.

lkuligin commented 3 days ago

asking @efriis for a second opinion :), but the change looks reasonable to me. thanks for the PR!

baskaryan commented 3 days ago

I think this will also break cases where a protobuf is passed in, which might be a useful feature (to match what vertex api provides). Will take a closer look when I'm back from vacation next week, and @baskaryan might have some takes before then!

The common abstraction currently includes pydantic BaseModel.

The question seems to be whether we want the common abstraction to also include:

  1. Whatever the provider API supports by default
  2. JSON Schema (OpenAI/Anthropic)

In general I'm more supportive of 1 (and closing this), and @baskaryan may have different input!

re: breaking support for protobuf schemas, agree we should avoid this. could we either explicitly check the type of the schema before calling convert_to_openai_function, or else try/except that conversion. and when the schema cannot be converted to openai format we keep current behavior
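
For illustration, the try/except option could look roughly like the sketch below. It is not code from this PR: `toy_convert` is a hypothetical stand-in for `convert_to_openai_function` so the example runs without LangChain installed, and `pick_key_name` is an invented helper name. Returning `None` stands for "keep the current behavior".

```python
from typing import Any, Callable, Optional


def toy_convert(schema: Any) -> dict:
    """Hypothetical stand-in for convert_to_openai_function (not the real one)."""
    if isinstance(schema, dict) and "title" in schema:
        return {
            "name": schema["title"],
            "description": schema.get("description", ""),
            "parameters": {
                "type": "object",
                "properties": schema.get("properties", {}),
            },
        }
    raise ValueError(f"cannot convert schema of type {type(schema).__name__}")


def pick_key_name(
    schema: Any, convert: Callable[[Any], dict] = toy_convert
) -> Optional[str]:
    """Return a tool name to key the parser on, or None to keep current behavior."""
    try:
        return convert(schema)["name"]
    except Exception:
        # Conversion failed (for example, a protobuf schema): fall back.
        return None
```

With this shape, a convertible dict yields a name for a keyed parser, while anything the converter rejects falls through to the existing path untouched.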

kiarina commented 2 days ago

> re: breaking support for protobuf schemas, agree we should avoid this. could we either explicitly check the type of the schema before calling convert_to_openai_function, or else try/except that conversion. and when the schema cannot be converted to openai format we keep current behavior

        elif isinstance(schema, dict) and all(
            k in schema for k in ("title", "description", "properties")
        ):
            schema = convert_to_openai_function(schema)
            parser = JsonOutputKeyToolsParser(
                key_name=schema["name"], first_tool_only=True
            )
        else:
            parser = JsonOutputToolsParser()

I've made it so that the behavior changes only when a dict containing title, description, and properties is passed. How does this look?
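
To make the condition concrete, here is a small standalone sketch of the key check from the excerpt above. The function name `takes_new_branch` and the sample schemas are invented for illustration, and the sketch ignores whatever earlier branches precede the `elif` in the real method.

```python
REQUIRED_KEYS = ("title", "description", "properties")


def takes_new_branch(schema: object) -> bool:
    """Mirror the elif condition: a dict that has all three required keys."""
    return isinstance(schema, dict) and all(k in schema for k in REQUIRED_KEYS)


# A JSON Schema style dict with all three keys takes the new branch.
full_schema = {
    "title": "Person",
    "description": "A person record.",
    "properties": {"name": {"type": "string"}},
}

# The same schema without "description" falls through to the else branch.
no_description = {
    "title": "Person",
    "properties": {"name": {"type": "string"}},
}

print(takes_new_branch(full_schema))
print(takes_new_branch(no_description))
print(takes_new_branch(object()))
```

One consequence worth noting: under this check, a dict schema that omits `description` keeps the old behavior even though it otherwise looks like JSON Schema.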