Closed: leotronolone closed this issue 2 months ago.
🤖
Thank you for the detailed bug report, leotronolone. It seems like the issue is related to the `get_parameters_dict` method in the `ToolMetadata` class. This method is responsible for generating the JSON schema for the function parameters. In the latest version of LlamaIndex, the method filters out any keys that are not in the list `["type", "properties", "required", "definitions"]`.
Here is the relevant code snippet:
```python
parameters = self.fn_schema.schema()
parameters = {
    k: v
    for k, v in parameters.items()
    if k in ["type", "properties", "required", "definitions"]
}
```
This means that any additional top-level keys in the schema are removed. In particular, the `$defs` key that Pydantic v2 emits for referenced objects and enums (Pydantic v1 called it `definitions`) is not in the allow-list above, so those definitions are dropped. This is likely the cause of the LLM returning function call inputs that don't respect the desired JSON schema.
To fix this issue, you could modify the `get_parameters_dict` method to include these additional keys in the schema. However, without knowing the exact structure of your schemas and how they are being used, it's hard to provide a specific solution.
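For instance, a minimal sketch of one possible modification (an assumption on my part, not an official patch) would be to extend the allow-list with the `$defs` key that Pydantic v2 emits:

```python
# Hypothetical patch sketch for ToolMetadata.get_parameters_dict: keep the
# "$defs" key (Pydantic v2) alongside "definitions" (Pydantic v1) so that
# referenced object and enum schemas survive the filtering.
def get_parameters_dict(self) -> dict:
    parameters = self.fn_schema.schema()
    parameters = {
        k: v
        for k, v in parameters.items()
        if k in ["type", "properties", "required", "definitions", "$defs"]
    }
    return parameters
```

With `$defs` preserved, the `#/$defs/EntityType` references in the generated schema would resolve again.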
Unfortunately, I wasn't able to find the exact changes made to the `get_parameters_dict` method between versions 0.9.14 and 0.9.40 of LlamaIndex. It would be helpful if you could provide more information about the changes you've noticed in the behavior of this method between these versions.
You can find the source code for the `ToolMetadata` class in the `llama_index/tools/types.py` file.
I hope this helps! If you have any more questions or need further clarification, feel free to ask.
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
FYI I was able to make a workaround where I defined the enum/object's schema in the description for the field.
```python
from pydantic import BaseModel, Field

# MATH_OPERATOR is an Enum defined elsewhere; its values are inlined into the
# field description so the LLM still sees the allowed options.
class CalculateFunctionModel(BaseModel):
    math_operator: MATH_OPERATOR = Field(description=f"Possible MATH_OPERATOR values: {[a.value for a in MATH_OPERATOR]}")
    valueA: int
    valueB: int
```
This workaround only works for the first layer of referenced object schemas, though (which is enough for my use case). Fixing this would still be greatly appreciated, as I'm sure my use case will need it eventually.
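For illustration, a hypothetical nested case (the `Inner`/`Outer` names are invented for this sketch) where the description trick cannot reach a second-layer reference:

```python
# Continues the snippet above: BaseModel, Field, MATH_OPERATOR already defined.
# Inner's schema lives in "$defs", and so does MATH_OPERATOR's, so inlining
# text on the outer field cannot expose the enum values Inner relies on.
class Inner(BaseModel):
    math_operator: MATH_OPERATOR  # second-layer reference, lost with "$defs"

class Outer(BaseModel):
    inner: Inner = Field(description="Inner object; its math_operator enum schema is stripped")
```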
Thank you for sharing your workaround, leotronolone! We appreciate your feedback and understanding of the issue. We'll go ahead and close this issue now. If you have any more questions or need further assistance in the future, feel free to reach out.
To continue the conversation, mention @dosu.
@dosu Can we reopen this? The underlying issue has still not been resolved
@logan-markewich Can you please assist with reopening this issue? The user has indicated that the underlying issue still needs attention. Thank you!
To continue the conversation, mention @dosu.
@leotronolone pretty sure this is solved in latest versions
@logan-markewich This workaround was required for 0.10.33 (latest version as of today)
@leotronolone looks fine to me, but I think you are defining the function tool in an unexpected way
https://colab.research.google.com/drive/1GEt_dkEXnnVkygBqtwYnFfMOAvr78OKi?usp=sharing
Bug Description
I am no longer able to use referenced objects/enums within function call inputs on the latest version of LlamaIndex (0.9.40). This is caused by the referenced objects no longer being added to the JSON schema sent to the LLM when defining the function calls. These JSON schemas were previously generated from the `fn_schema` input (which extends `BaseModel`) passed to `FunctionTool.from_defaults()`.
For example, the schema generated on the latest version (note the missing `$defs` entry for `EntityType`):

```json
{
  "properties": {
    "ask": {
      "description": "natural language query string",
      "title": "Ask",
      "type": "string"
    },
    "entity_types": {
      "description": "The types of entities to search for answer",
      "items": {"$ref": "#/$defs/EntityType"},
      "title": "Entity Types",
      "type": "array"
    }
  },
  "required": ["ask", "entity_types"],
  "type": "object"
}
```

versus the schema generated on the old version, where `$defs` is present:

```json
{
  "$defs": {
    "EntityType": {
      "enum": ["file", "response", "member"],
      "title": "EntityType",
      "type": "string"
    }
  },
  "properties": {
    "ask": {
      "description": "natural language query string",
      "title": "Ask",
      "type": "string"
    },
    "entity_types": {
      "description": "The types of entities to search for answer",
      "items": {"$ref": "#/$defs/EntityType"},
      "title": "Entity Types",
      "type": "array"
    }
  },
  "required": ["ask", "entity_types"],
  "title": "AskQuestionGPTFunctionModel",
  "type": "object"
}
```
This is causing the LLM to return function call inputs that don't respect the desired JSON schema, which leads to an exception when it tries to call the function with a bad input value. For example, it will try calling the `AskQuestion` function with `entity_types = ['document']` instead of one of the valid values `["file", "response", "member"]`.

This was working for me on 0.9.14. I tried updating `pydantic` and `openai` to their latest versions as well, but had no luck.
Version
0.9.15+
Steps to Reproduce
Run the following example code:
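(The original snippet did not survive in this copy of the issue; the following is a minimal reproduction sketch reconstructed from the schemas above, with the `ask_question` body invented as a stand-in for the real function.)

```python
# Reconstructed reproduction sketch (names assumed from the schemas above).
from enum import Enum
from typing import List

from pydantic import BaseModel, Field
from llama_index.tools import FunctionTool


class EntityType(str, Enum):
    FILE = "file"
    RESPONSE = "response"
    MEMBER = "member"


class AskQuestionGPTFunctionModel(BaseModel):
    ask: str = Field(description="natural language query string")
    entity_types: List[EntityType] = Field(
        description="The types of entities to search for answer"
    )


def ask_question(ask: str, entity_types: List[EntityType]) -> str:
    """Stand-in implementation for the real search function."""
    return f"searching {entity_types} for: {ask}"


tool = FunctionTool.from_defaults(
    fn=ask_question,
    name="AskQuestion",
    fn_schema=AskQuestionGPTFunctionModel,
)

# On affected versions the printed schema contains "$ref": "#/$defs/EntityType"
# but no "$defs" entry, so the LLM never sees the allowed enum values.
print(tool.metadata.get_parameters_dict())
```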
Relevant Logs/Tracebacks