Open xinpeng-zhang opened 1 week ago
The solution for parsing the correct JSON format from models that don’t support function calls has been pretty unstable, so I’ve added a new mode to use ReAct as the format in completion messages. So far, I’ve only verified the solution on Qwen 2.5 Coder 32B Instruct. Also, just a heads-up: actions for editing code now use an XML format instead of JSON.
Here's how to enable it, for example when running evaluations:
completion_model = CompletionModel(
    model=model,
    temperature=0.0,
    max_tokens=2000,
    model_api_key=api_key,
    model_base_url=base_url,
    response_format=LLMResponseFormat.REACT,
)
tree_search_settings = TreeSearchSettings(
    max_iterations=30,
    max_expansions=1,
    agent_message_history_type=MessageHistoryType.REACT,
    model=completion_model,
)
evaluation = Evaluation(
    evaluations_dir="/path/to/eval_dir",
    evaluation_name="20241124_Qwen2.5-Coder-32B-Instruct",
    num_workers=1,
    use_testbed=True,
    repo_base_dir="/tmp/repos",
    settings=tree_search_settings,
    dataset_name="princeton-nlp/SWE-bench_Lite",
)
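For readers unfamiliar with ReAct-style output, here is a minimal sketch of how such a completion could be split into its parts. The `Thought:` / `Action:` / `Action Input:` layout, the `ReActStep` type, and the `parse_react` helper are all assumptions made for illustration; the exact format this project emits is not shown in the thread and may differ.

```python
import re
from typing import NamedTuple


class ReActStep(NamedTuple):
    """One parsed step of a ReAct-style completion (hypothetical type)."""
    thought: str
    action: str
    action_input: str


# Assumed layout: three labelled sections in a fixed order.
# The action name is expected on a single line; the input may span many lines.
_REACT_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"Action:\s*(?P<action>[^\n]+?)\s*"
    r"Action Input:\s*(?P<action_input>.*)",
    re.DOTALL,
)


def parse_react(text: str) -> ReActStep:
    """Split a completion into thought, action name, and raw action input."""
    match = _REACT_PATTERN.search(text)
    if match is None:
        raise ValueError(
            "response does not follow the assumed Thought/Action/Action Input layout"
        )
    return ReActStep(
        thought=match["thought"].strip(),
        action=match["action"].strip(),
        action_input=match["action_input"].strip(),
    )
```

Because the action input is kept as raw text, it can carry an XML body or any other payload without the escaping problems that come with embedding code in JSON strings.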
Cool, I will give it a try. Thank you!
Full JSON schema: {'$defs': {'ChangeType': {'enum': ['addition', 'modification', 'deletion'], 'title': 'ChangeType', 'type': 'string'}, 'CodeSpan': {'properties': {'file_path': {'description': 'The file path where the relevant code is found.', 'title': 'File Path', 'type': 'string'}, 'start_line': {'anyOf': [{'type': 'integer'}, {'type': 'null'}], 'default': None, 'description': 'The start line of the code to add to context.', 'title': 'Start Line'}, 'end_line': {'anyOf': [{'type': 'integer'}, {'type': 'null'}], 'default': None, 'description': 'The end line of the code to add to context.', 'title': 'End Line'}, 'span_ids': {'description': "Span IDs identiying the relevant code spans. A span id is a unique identifier for a code sippet. It can be a class name or function name. For functions in classes separete with a dot like 'class.function'.", 'items': {'type': 'string'}, 'title': 'Span Ids', 'type': 'array'}}, 'required': ['file_path'], 'title': 'CodeSpan', 'type': 'object'}, 'FindClassArgs': {'description': 'Find a specific class in the codebase.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'file_pattern': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': 'A glob pattern to filter search results to specific files or directories.', 'title': 'File Pattern'}, 'class_name': {'description': 'Specific class name to include in the search.', 'title': 'Class Name', 'type': 'string'}}, 'required': ['scratch_pad', 'class_name'], 'title': 'FindClass', 'type': 'object'}, 'FindCodeSnippetArgs': {'description': 'Request to search for an exact code snippet.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'file_pattern': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': 'A glob pattern to filter search results to specific file types or directories. 
', 'title': 'File Pattern'}, 'code_snippet': {'description': 'The exact code snippet to find.', 'title': 'Code Snippet', 'type': 'string'}}, 'required': ['scratch_pad', 'code_snippet'], 'title': 'FindCodeSnippet', 'type': 'object'}, 'FindFunctionArgs': {'description': 'Search for a specific function or class in the codebase.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'file_pattern': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': 'A glob pattern to filter search results to specific files or directories.', 'title': 'File Pattern'}, 'function_name': {'description': 'Specific function names to include in the search.', 'title': 'Function Name', 'type': 'string'}, 'class_name': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': 'Specific class name to include in the search.', 'title': 'Class Name'}}, 'required': ['scratch_pad', 'function_name'], 'title': 'FindFunction', 'type': 'object'}, 'FinishArgs': {'description': 'Indicate that the task is fully completed.', 'properties': {'scratch_pad': {'description': 'Your reasoning about why the task is complete.', 'title': 'Scratch Pad', 'type': 'string'}, 'finish_reason': {'description': 'Explanation of completion.', 'title': 'Finish Reason', 'type': 'string'}}, 'required': ['scratch_pad', 'finish_reason'], 'title': 'Finish', 'type': 'object'}, 'RejectArgs': {'description': 'Reject the task and explain why.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'rejection_reason': {'description': 'Explanation for rejection.', 'title': 'Rejection Reason', 'type': 'string'}}, 'required': ['scratch_pad', 'rejection_reason'], 'title': 'Reject', 'type': 'object'}, 'RequestCodeChangeArgs': {'description': 'Apply a code change through an AI agent. 
This action instructs an AI assistant to \nmodify code based on provided instructions and pseudo-code. The AI will analyze the existing code within \nthe specified line range and apply changes while maintaining proper syntax, indentation, and context.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'file_path': {'description': 'The file path of the code to be updated.', 'title': 'File Path', 'type': 'string'}, 'instructions': {'description': 'Natural language instructions for the AI assistant describing the required code changes.', 'title': 'Instructions', 'type': 'string'}, 'pseudo_code': {'description': 'Example code snippet illustrating the desired changes. The AI will use this as a reference for implementing the modifications.', 'title': 'Pseudo Code', 'type': 'string'}, 'change_type': {'$ref': '#/$defs/ChangeType', 'description': "Type of change to perform: 'addition' (insert new code), 'modification' (update existing code), or 'deletion' (remove code)."}, 'start_line': {'description': 'The line number where the code change should begin. For additions, specifies the insertion point.', 'title': 'Start Line', 'type': 'integer'}, 'end_line': {'description': 'The line number where the code change should end. 
For additions, specifies the insertion point.', 'title': 'End Line', 'type': 'integer'}}, 'required': ['scratch_pad', 'file_path', 'instructions', 'pseudo_code', 'change_type', 'start_line', 'end_line'], 'title': 'RequestCodeChange', 'type': 'object'}, 'RequestMoreContextArgs': {'description': 'Request additional code spans to be added to your current context.', 'properties': {'scratch_pad': {'description': 'Your thoughts on the code change.', 'title': 'Scratch Pad', 'type': 'string'}, 'files': {'description': 'The code that should be provided in the file context.', 'items': {'$ref': '#/$defs/CodeSpan'}, 'title': 'Files', 'type': 'array'}}, 'required': ['scratch_pad', 'files'], 'title': 'RequestMoreContext', 'type': 'object'}, 'SemanticSearchArgs': {'description': 'Search for code snippets based on semantic similarity.', 'properties': {'scratch_pad': {'description': 'Your reasoning for the action.', 'title': 'Scratch Pad', 'type': 'string'}, 'file_pattern': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': 'A glob pattern to filter search results to specific files or directories.', 'title': 'File Pattern'}, 'query': {'description': "Natural language description of what you're looking for.", 'title': 'Query', 'type': 'string'}, 'category': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': None, 'description': "The category of files to search for. 
This can be 'implementation' for core implementation files or 'test' for test files.", 'title': 'Category'}}, 'required': ['scratch_pad', 'query'], 'title': 'SemanticSearch', 'type': 'object'}}, 'properties': {'action': {'anyOf': [{'$ref': '#/$defs/FindClassArgs'}, {'$ref': '#/$defs/FindFunctionArgs'}, {'$ref': '#/$defs/FindCodeSnippetArgs'}, {'$ref': '#/$defs/SemanticSearchArgs'}, {'$ref': '#/$defs/RequestMoreContextArgs'}, {'$ref': '#/$defs/RequestCodeChangeArgs'}, {'$ref': '#/$defs/FinishArgs'}, {'$ref': '#/$defs/RejectArgs'}], 'title': 'Action'}, 'action_type': {'description': 'The type of action being taken', 'title': 'Action Type', 'type': 'string'}}, 'required': ['action', 'action_type'], 'title': 'TakeAction', 'type': 'object'}
This is the OpenAI schema that I printed out. When parsing the JSON below, it raises a key error on start_line: { "action": { "scratch_pad": "The has_key method in the FileBasedCache class has been modified to handle FileNotFoundError, addressing the race condition issue.", "finish_reason": "The race condition in the has_key method has been resolved by adding appropriate error handling." }, "action_type": "Finish" }
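One possible reading of this error, offered as a guess and not a confirmed diagnosis: in the schema above, RequestCodeChange is the only action that requires start_line, so a Finish payload may be getting handled as if it were a RequestCodeChange. A minimal sketch of selecting the expected fields from action_type before reading the action, using only the standard library, could look like this. The `REQUIRED_FIELDS` table and `parse_take_action` helper are hypothetical names, not part of the project; the field lists are copied from the 'required' entries in the printed schema.

```python
import json

# Required fields per action title, copied from the schema printed above.
REQUIRED_FIELDS = {
    "FindClass": ["scratch_pad", "class_name"],
    "FindFunction": ["scratch_pad", "function_name"],
    "FindCodeSnippet": ["scratch_pad", "code_snippet"],
    "SemanticSearch": ["scratch_pad", "query"],
    "RequestMoreContext": ["scratch_pad", "files"],
    "RequestCodeChange": [
        "scratch_pad", "file_path", "instructions", "pseudo_code",
        "change_type", "start_line", "end_line",
    ],
    "Finish": ["scratch_pad", "finish_reason"],
    "Reject": ["scratch_pad", "rejection_reason"],
}


def parse_take_action(raw: str) -> tuple[str, dict]:
    """Parse a TakeAction JSON string and check it against its own action type."""
    payload = json.loads(raw)
    action_type = payload["action_type"]
    action = payload["action"]
    if action_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown action_type: {action_type!r}")
    # Only the fields of the declared action type are checked, so a Finish
    # payload is never asked for start_line or end_line.
    missing = [name for name in REQUIRED_FIELDS[action_type] if name not in action]
    if missing:
        raise ValueError(f"{action_type} is missing required fields: {missing}")
    return action_type, action
```

With this kind of dispatch, the Finish payload above passes the check, while a RequestCodeChange without start_line fails with a message that names the missing field.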