getappmap / navie-benchmark

Sonnet output mixes up path prefixes #85

Open kgilpin opened 2 weeks ago

kgilpin commented 2 weeks ago

Here's an answer to a choose_test_file query:

Based on the code snippets and problem description, here are the 3 most relevant files that are likely the root causes of the issue with unnecessary joins in admin changelist queries:

/home/runner/work/navie-benchmark/navie-benchmark/solve/django__django-15022/source/django/contrib/admin/views/main.py
/home/runner/work/navie-benchmark/navie-benchmark/solve/django/db/models/fields/__init__.py
/home/runner/work/navie-benchmark/navie-benchmark/solve/django/contrib/admin/views/main.py

The issue stems from how the admin changelist handles search queries and joins, particularly in the ChangeList class and related field handling. The main.py file contains the core logic for processing admin changelist queries and applying filters/searches, while fields/__init__.py defines the field types and their behavior that affects how joins are generated.

In the search log, all the code snippets have the correct instance prefix /home/runner/work/navie-benchmark/navie-benchmark/solve/django__django-15022:

[code-snippet] /home/runner/work/navie-benchmark/navie-benchmark/solve/django__django-15022/source/django/db/models/fields/__init__.py

But in the LLM output, the path elements django__django-15022/source have been stripped out of the second and third paths. Only the first path keeps them.
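A quick way to spot the affected entries is to check each path in the answer against the instance root. The sketch below is only an illustration: the helper name `missing_instance_prefix` is hypothetical and not part of navie-benchmark, and the two roots are copied from the paths quoted above.

```python
from pathlib import PurePosixPath

# Roots copied from the paths in this issue; adjust per instance.
SOLVE_DIR = PurePosixPath("/home/runner/work/navie-benchmark/navie-benchmark/solve")
INSTANCE_ROOT = SOLVE_DIR / "django__django-15022" / "source"


def missing_instance_prefix(paths):
    """Return the paths that sit under solve/ but not under the instance root.

    Hypothetical helper for illustration; not an existing navie-benchmark API.
    """
    flagged = []
    for raw in paths:
        path = PurePosixPath(raw)
        under_solve = SOLVE_DIR in path.parents
        under_instance = INSTANCE_ROOT in path.parents
        if under_solve and not under_instance:
            flagged.append(raw)
    return flagged
```

Run against the three paths in the answer above, this flags the second and third entries and leaves the first alone.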

django__django-15022.zip

github-actions[bot] commented 2 weeks ago

Title: Fix Path Stripping in Sonnet Output for File Query Responses

Problem: When generating responses for a choose_test_file query, the Large Language Model (LLM) strips necessary path elements (django__django-15022/source) from some file paths. The resulting paths lack the instance directory prefix and do not point to files that exist, which makes it hard to locate or reference the intended files.

Analysis:

Proposed Changes:

  1. Response Generation Logic:

    • Review and modify the logic in the response generation system to ensure that it correctly retains necessary path components like django__django-15022/source.
    • Identify and correct any rules or functions responsible for indiscriminate path prefix removal that are leading to this stripping issue.
  2. Path Handling Module:

    • Introduce or adjust a path normalization function that intelligently determines which path components are important and should be preserved.
    • Ensure that any path normalization logic clearly distinguishes between directory names that are part of the legitimate path structure versus those meant for stripping.
  3. Testing and Validation:

    • Create test scenarios mimicking the path structures seen in the issue (django__django-15022/source) to systematically ensure that necessary components are retained in the output.
    • Validate the output to confirm that the full path is present and correct without any missing directory prefixes that existed in the search logs.
  4. Documentation Update:

    • Update any documentation or guidelines that relate to the path processing or response generation logic to reflect the corrected approach to path handling, ensuring that all relevant prefixes are maintained.

By implementing these changes, the response generation system will reliably output file paths that users can directly reference, ensuring an accurate and consistent user experience.
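The path handling in item 2 and the test scenario in item 3 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `restore_instance_prefix` is a hypothetical name, the roots are copied from the paths quoted in this issue, and the repair assumes that anything directly under `solve/` without the instance prefix belongs to this one instance.

```python
from pathlib import PurePosixPath

# Roots copied from the paths in this issue; both are assumptions of the sketch.
SOLVE_DIR = PurePosixPath("/home/runner/work/navie-benchmark/navie-benchmark/solve")
INSTANCE_ROOT = SOLVE_DIR / "django__django-15022" / "source"


def restore_instance_prefix(raw: str) -> str:
    """Re-insert the instance prefix when the model has dropped it.

    Hypothetical helper. Paths already under the instance root, and paths
    outside solve/ entirely, are returned unchanged.
    """
    path = PurePosixPath(raw)
    if path == INSTANCE_ROOT or INSTANCE_ROOT in path.parents:
        return str(path)  # prefix is intact
    if SOLVE_DIR in path.parents:
        # Prefix was stripped: graft the remainder onto the instance root.
        return str(INSTANCE_ROOT / path.relative_to(SOLVE_DIR))
    return str(path)  # not a benchmark path; leave it alone


# Test scenario mirroring the paths seen in this issue.
stripped = str(SOLVE_DIR / "django/db/models/fields/__init__.py")
expected = str(INSTANCE_ROOT / "django/db/models/fields/__init__.py")
assert restore_instance_prefix(stripped) == expected
assert restore_instance_prefix(expected) == expected
```

Note the limitation: if several instances share the same `solve/` directory, a path belonging to a different instance would be rewritten incorrectly, so a real fix would need to know which instance the response was generated for.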

dividedmind commented 2 weeks ago

You mean the plan output? I think it's a losing battle, trying to get the model to always use absolute paths. We might want to find a way to handle relative paths instead.
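One possible shape for that, as a sketch only: accept whatever the model emits and resolve relative paths against the instance's source directory. The function name `resolve_model_path` is hypothetical, and the default root is copied from the paths quoted earlier in this issue.

```python
from pathlib import PurePosixPath

# Assumed instance root, copied from the search-log path in this issue.
INSTANCE_ROOT = PurePosixPath(
    "/home/runner/work/navie-benchmark/navie-benchmark/solve/django__django-15022/source"
)


def resolve_model_path(raw: str, root: PurePosixPath = INSTANCE_ROOT) -> str:
    """Resolve a path from model output against the instance root.

    Hypothetical helper: absolute paths pass through untouched, relative
    paths are joined onto the root. It does not collapse ".." segments.
    """
    path = PurePosixPath(raw.strip())
    if path.is_absolute():
        return str(path)
    return str(root / path)
```

With this, an answer such as `django/contrib/admin/views/main.py` would resolve to the full path under the instance root. An absolute path with a missing prefix would still pass through unchanged, so this complements a check on absolute paths and does not replace it.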