getappmap / navie-benchmark

Navie benchmarks
MIT License

Strip fences from test and code list output #86

Status: Closed (kgilpin closed this issue 3 weeks ago)

kgilpin commented 3 weeks ago

LLM output for choose_test_files may include fences that we want to strip out:

2024-10-29 19:34:36,318 - INFO - [choose-test-file] (astropy__astropy-13398) Found no existing test files in Based on the context, I'll identify the 3 most relevant test files for this issue involving ITRS to Observed transformations. Here they are in order of relevance:

1. `/home/runner/work/navie-benchmark/navie-benchmark/solve/astropy__astropy-13398/source/astropy/coordinates/tests/test_intermediate_transformations.py`
- Contains key test cases for transformations between coordinate systems
- Has the critical `test_straight_overhead()` function mentioned in the issue
- Tests ITRS<->GCRS, CIRS<->ITRS and other intermediate transformations
- Most comprehensive test file for coordinate transformations

2. `/home/runner/work/navie-benchmark/navie-benchmark/solve/astropy__astropy-13398/source/astropy/coordinates/tests/test_regression.py`
- Contains regression tests for coordinate-related bugs
- Tests AltAz transformations with EarthLocation
- Has tests for ITRS transformations and edge cases
- Important for ensuring backwards compatibility

3. `/home/runner/work/navie-benchmark/navie-benchmark/solve/astropy__astropy-13398/source/astropy/coordinates/tests/test_icrs_observed_transformations.py`
- Specifically tests ICRS<->AltAz transformations
- Tests consistency between different transformation paths
- Relevant for validating the new direct ITRS<->AltAz approach
- Contains tests for HADec transformations as well

These files would be most important for testing and validating the proposed direct ITRS to Observed transformations approach.

Note that /home/runner/work/navie-benchmark/navie-benchmark/solve/astropy__astropy-13398/source/astropy/coordinates/tests/test_intermediate_transformations.py is the correct test file to modify, based on the dataset results.

github-actions[bot] commented 3 weeks ago

Title: Strip Fences from LLM Output in choose_test_files

Problem: The output from the choose_test_files function includes unnecessary notation or fences (e.g., list item markers or comments) around file paths in its results. The presence of these fences can increase the difficulty of parsing and using these paths programmatically. The desired outcome is to have clean, unadorned file paths that can be easily utilized in subsequent operations.

Analysis: The current output of the choose_test_files function contains file paths wrapped in backticks and prefixed with numbered-list markers, interleaved with explanatory bullet points. This formatting is readable for a human, but when the output is consumed programmatically by other parts of the system, or displayed where raw paths are required, the surrounding markup gets in the way. The fix is to extract and sanitize the file paths so that they are ready for programmatic use.

The current implementation might rely on standard logging or representation functions which prepend additional characters for formatting. The objective here is to preprocess these entries such that these fences are removed, leaving only the essential file paths.

Proposed Changes:

  1. solver/workflow/generate_test.py:

    • Introduce a function that processes the output from the choose_test_files logic before it is displayed or logged.
    • This function should iterate over each line of the string output and extract file paths by identifying patterns that match typical path structures.
    • Remove or clean up any inline notation attached to the file paths.

  2. Function Logic (within generate_test.py):

    • Use regex pattern matching to differentiate actual file paths from extraneous text.
    • Strip the start of each file path of any unwanted prefix that identifies it as a list item or fenced section.
    • Construct a new list of paths that are clean and free from the fences, returning this as the modified output of the choose_test_files function.

With these changes, choose_test_files returns a plain list of paths with no surrounding markup, which downstream code can use without further parsing.
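The proposed changes above could be sketched roughly as follows. This is a minimal illustration, not code from the repository: the function name `extract_test_file_paths`, the `PATH_PATTERN` regex, and the assumption that test files always end in `.py` are all hypothetical.

```python
import re
from typing import List

# Hypothetical pattern: a run of path characters ending in ".py".
# Backticks, list markers ("1.", "-") and spaces are not path characters,
# so they are left behind when the path is matched.
PATH_PATTERN = re.compile(r"[\w./-]+\.py\b")


def extract_test_file_paths(llm_output: str) -> List[str]:
    """Return the .py paths found in LLM output, in order, without duplicates."""
    paths: List[str] = []
    for line in llm_output.splitlines():
        # Skip code-fence delimiter lines such as ``` or ```text.
        if line.strip().startswith("```"):
            continue
        for match in PATH_PATTERN.findall(line):
            if match not in paths:
                paths.append(match)
    return paths
```

Applied to the log output quoted in this issue, a function like this would be expected to keep the three backtick-quoted paths and drop the descriptive bullets, since those contain no `.py` path.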

dividedmind commented 3 weeks ago

I'm not sure what you mean; the output you included doesn't contain any fences. Do you mean the backticks used to quote the paths? Perhaps it would be more robust to ask for structured output to choose test files.
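One way the structured-output idea might look, assuming the prompt is changed to ask for a JSON array of paths. The function name `parse_test_files_json` and the fence-tolerant fallback are assumptions made for illustration, not part of the existing solver.

```python
import json
import re
from typing import List

# Matches a reply that is wrapped entirely in one Markdown code fence,
# e.g. ```json ... ```, and captures the body.
FENCE_RE = re.compile(r"^```[\w-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)


def parse_test_files_json(llm_output: str) -> List[str]:
    """Parse a JSON array of path strings, tolerating a surrounding code fence."""
    text = llm_output.strip()
    fenced = FENCE_RE.match(text)
    if fenced:
        text = fenced.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError on free-form prose
    if not isinstance(data, list) or not all(isinstance(p, str) for p in data):
        raise ValueError("expected a JSON array of path strings")
    return data
```

With this approach a prose reply like the one in the log fails loudly at `json.loads` instead of being logged as a file name, so the caller can retry or fall back.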