Open pdhoolia opened 3 months ago
Great question!
At a high level, I don't think it's good practice to expose polymorphic functions directly to LLMs. Using kwargs gives the LLM more ambiguity to resolve, which will almost certainly reduce reliability.
Mechanistically, tools are called using invoke() or ainvoke() (for async), not as plain functions. That means you'd call it like
query_data.invoke({"query_type": "foo", "kwargs": {"ey": "yo"}})
and then you'd get a literal 'kwargs'-keyed keyword argument that you would have to de-nest yourself. We could potentially look into adding more native support for kwargs, though that work would belong in the langchain-core repo.
But again, if I were trying to get good results from an LLM, I would avoid overloading six tools into one here and instead make it explicit what the LLM is supposed to provide.
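As a rough illustration of the explicit alternative, the sketch below splits three of the query types into separate functions, each with one named, required argument. The sample rows, the snake_case parameter names, and the exact-match filtering are assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical sample data standing in for the JSON file from the issue.
EMPLOYEES: List[Dict[str, str]] = [
    {"name": "Ann Lau", "departmentName": "Human Resources US", "managerId": "80000"},
    {"name": "Bob Ray", "departmentName": "Human Resources US", "managerId": "82092"},
    {"name": "Cy Moss", "departmentName": "Sales", "managerId": "82092"},
]


def get_by_name(name: str) -> List[Dict[str, str]]:
    """Return employees whose name matches exactly."""
    return [row for row in EMPLOYEES if row["name"] == name]


def get_by_department(department_name: str) -> List[Dict[str, str]]:
    """Return employees in the given department."""
    return [row for row in EMPLOYEES if row["departmentName"] == department_name]


def get_by_manager(manager_id: str) -> List[Dict[str, str]]:
    """Return employees who report to the given manager ID."""
    return [row for row in EMPLOYEES if row["managerId"] == manager_id]
```

Each function could then be registered as its own tool, for example with the `@tool` decorator from `langchain_core.tools`, so the model sees one unambiguous argument per tool instead of a free-form kwargs bag.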
I'm going to move this to the langchain repo, but I do think it would be worth considering whether we can improve support for this...
> At a high level, I don't think it's good practice to expose polymorphic functions directly to LLMs. Using kwargs gives the LLM more ambiguity to resolve, which will almost certainly reduce reliability.
I agree, and for a quick proof-of-concept that was what I was going to do. Before doing that however, I asked the LLM what it would do (with a prompt like):
I want to write a generic python function that i can use with Open AI function_call support to perform all kinds of queries with this data.
Ironically, the LLM led with that approach and settled on:
def query_data(query_type, **kwargs):
    """
    Query the employee data with various filters.

    Args:
        query_type (str): The type of query to perform. Must be one of the following:
            - "get_by_name": Retrieve entries based on the employee's name.
            - "get_by_email": Retrieve entries based on the employee's email.
            - "get_by_department": Retrieve entries based on the department name.
            - "get_by_manager": Retrieve entries based on the manager's ID.
            - "get_by_managerName": Retrieve entries based on the manager's name.
            - "get_by_userId": Retrieve entries based on the employee's user ID.
        **kwargs: Additional arguments specific to the query type.
            - name (str, optional): The name of the employee to search for (required for "get_by_name").
            - email (str, optional): The email of the employee to search for (required for "get_by_email").
            - departmentName (str, optional): The name of the department to search for (required for "get_by_department").
            - managerId (str, optional): The ID of the manager to search for (required for "get_by_manager" and "get_direct_reports").
            - managerName (str, optional): The name of the manager to search for (required for "get_by_managerName").
            - userId (str, optional): The user ID of the employee to search for (required for "get_by_userId").

    Returns:
        list: A list of dictionary entries from the data that match the query criteria.
        dict: If an error occurs, returns a dictionary with an error message.

    Examples:
        >>> query_data(query_type="get_by_name", name="Ann Lau")
        Returns all entries with the name "Ann Lau".
        >>> query_data(query_type="get_by_email", email="Ann.Lau@bestrunsap.com")
        Returns the entry with the email "Ann.Lau@bestrunsap.com".
        >>> query_data(query_type="get_by_department", departmentName="Human Resources US")
        Returns all entries within the "Human Resources US" department.
        >>> query_data(query_type="get_by_manager", managerId="82092")
        Returns all entries managed by the manager with ID "82092".
        >>> query_data(query_type="get_by_managerName", managerName="Ann Lau")
        Returns all entries managed by the manager named "Ann Lau".
    """
    ...
Also, generic lookup APIs like this are quite common. While they could be wrapped in a set of de-polymorphing functions, that could be quite a lot of work every time we encounter such an enterprise API.
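One way to reduce that per-API effort, sketched here under assumptions, is to generate the explicit wrappers from a small table instead of writing each by hand. The `query_data` body below is a hypothetical stand-in for the generic API, the sample rows are invented, and `QUERY_ARGS` and `make_wrapper` are illustrative names; only the query types and argument names come from the docstring above.

```python
from typing import Callable, Dict, List

# Hypothetical sample data; field names follow the docstring above.
EMPLOYEES: List[Dict[str, str]] = [
    {"userId": "82092", "name": "Ann Lau", "email": "ann.lau@example.com",
     "departmentName": "Human Resources US", "managerId": "80000",
     "managerName": "Dee Roy"},
    {"userId": "90001", "name": "Bob Ray", "email": "bob.ray@example.com",
     "departmentName": "Human Resources US", "managerId": "82092",
     "managerName": "Ann Lau"},
]


def query_data(query_type: str, **kwargs: str) -> List[Dict[str, str]]:
    """Stand-in for the generic lookup API (hypothetical implementation)."""
    field, value = next(iter(kwargs.items()))
    return [row for row in EMPLOYEES if row.get(field) == value]


# One entry per query type: the single argument that query type requires.
QUERY_ARGS: Dict[str, str] = {
    "get_by_name": "name",
    "get_by_email": "email",
    "get_by_department": "departmentName",
    "get_by_manager": "managerId",
    "get_by_managerName": "managerName",
    "get_by_userId": "userId",
}


def make_wrapper(query_type: str, arg_name: str) -> Callable[[str], List[Dict[str, str]]]:
    """Build a single-purpose function that forwards to the generic API."""
    def wrapper(value: str) -> List[Dict[str, str]]:
        return query_data(query_type, **{arg_name: value})

    wrapper.__name__ = query_type
    wrapper.__doc__ = f"Look up employees by {arg_name}."
    return wrapper


WRAPPERS = {qt: make_wrapper(qt, arg) for qt, arg in QUERY_ARGS.items()}
```

For example, `WRAPPERS["get_by_manager"]("82092")` forwards to `query_data("get_by_manager", managerId="82092")`. The trade-off in this sketch is that every generated function takes a generically named `value` parameter, so the docstring has to carry the meaning of the argument.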
Example Code
I have a simple query tool (over a json) as follows:
If I run this function in isolation using something like:
It works as expected.
However, when the tool is called by langgraph, it seems to return []. It seems that no kwargs are received. In the studio I see the following:
The tool call message is correct. However, the tool response in the next message is empty.
Also, my log for the query_data function is:
I think something related to kwargs here may be causing the problem.
Description
I expect the tool call to return a result, but it seems the kwargs part of the tool signature is somehow getting lost.
System Info
pip freeze | grep langchain
platform: mac
python version: I have tried with both 3.11 and 3.12, with the same results.