getappmap / navie-benchmark

Navie benchmarks
MIT License

include and exclude patterns should be more specific #89

Closed kgilpin closed 3 weeks ago

kgilpin commented 3 weeks ago

/exclude=test excludes all files in the pytest project, resulting in no search matches.

Use a more specific pattern, like:

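# Note: in a raw string, \\ matches one literal backslash, so these
# patterns assume Windows-style path separators.
test_regexp_patterns = [
    r"\\testing\\",  # any "testing" directory
    r"\\tests\\",  # any "tests" directory
    r"\\test\\",  # any "test" directory
    r"(.*?\\test_).*\.py",  # files named test_*.py
    r"(.*?\\).*_test\.py",  # files named *_test.py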
]

github-actions[bot] commented 3 weeks ago

Title

Refine Include and Exclude Patterns for Test Files

Problem

The current configuration for excluding test files in the pytest project is overly broad. The exclusion pattern /exclude=test excludes every file in the project, so searches return no matches, which is not the intended behavior. The patterns need to be refined so that they exclude only the test files and leave other files alone.

Analysis

A catch-all pattern like /exclude=test also excludes non-test files whose paths merely contain the string "test". More precise regular expression patterns are needed to target only the intended test files. Patterns that match the structure of test file paths ensure that the right files are included or excluded.
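To make the failure mode concrete, here is a minimal sketch. It assumes the filter behaves like an unanchored regular expression search over each file path (the issue does not state how /exclude is matched), and the paths are hypothetical examples modeled on a pytest-style layout.

```python
import re

# Hypothetical paths modeled on a pytest-style layout (illustration only).
paths = [
    "src/_pytest/config/__init__.py",
    "src/pytest/__init__.py",
    "testing/test_config.py",
]

# Assumption: /exclude=test acts like an unanchored search on the full path.
broad = re.compile(r"test")

# Every path contains "test" somewhere (via "pytest"), so nothing survives.
kept = [p for p in paths if not broad.search(p)]
print(kept)
```

Under that assumption, the project name alone is enough to make every path match, which is consistent with the empty search results described in the issue.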

The problem can be addressed by adopting a set of regular expression patterns that specifically identify common test file naming conventions and directory structures. This typically includes paths like "testing/", directories named "tests" or "test", and filenames that either start with "test_" or end with "_test.py".
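One way these conventions might be expressed is sketched below. This is an adaptation, not the patterns from the issue: it accepts either "/" or "\" as the path separator and anchors each directory or file name at a path boundary. The names SEP, TEST_PATTERNS, and is_test_file are hypothetical.

```python
import re

# Hypothetical helper: match either POSIX or Windows path separators.
SEP = r"[\\/]"

# Each pattern requires a path boundary before the name, so "_pytest/"
# or "contest/" does not count as a "test" directory.
TEST_PATTERNS = [
    re.compile(rf"(^|{SEP})testing{SEP}"),       # testing/ directory
    re.compile(rf"(^|{SEP})tests{SEP}"),         # tests/ directory
    re.compile(rf"(^|{SEP})test{SEP}"),          # test/ directory
    re.compile(rf"(^|{SEP})test_[^\\/]*\.py$"),  # test_*.py files
    re.compile(rf"(^|{SEP})[^\\/]*_test\.py$"),  # *_test.py files
]


def is_test_file(path: str) -> bool:
    """Return True if the path matches any of the test-file conventions."""
    return any(p.search(path) for p in TEST_PATTERNS)


for candidate in [
    "src/_pytest/python.py",
    "testing/test_config.py",
    "src/pkg/util_test.py",
    "src/contest/main.py",
]:
    print(candidate, is_test_file(candidate))
```

Anchoring at a separator is the design choice that matters here: it keeps the match tied to whole path components instead of arbitrary substrings.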

Proposed Changes

  1. solver/harness/python_version.py: The problem might involve how the test harness interprets and applies these filters. If so, update the include/exclude handling to use more specific regex patterns, and ensure the harness functions apply the refined patterns for file exclusions.

  2. solver/workflow/patch.py: Modify the exclusion logic to handle a list of detailed patterns rather than relying on a simple match string. This involves updating the functions that apply these filters (clean_patch, exclude_files) to accept and utilize regex patterns.

  3. solver/tests/workflow/test_filter_patches_to_tests.py: Ensure tests adequately reflect the improvements in pattern accuracy by using a variety of sample_patch and sample_diff inputs. Verify that the pattern logic accurately distinguishes between test and non-test files based on the newly introduced regex patterns.

  4. Regular Expression Pattern List:

    • Introduce a configuration point in the system where such patterns reside, allowing ease of maintenance. Consider an array like test_regexp_patterns to specify these patterns.
    • Update any documentation or configuration guides to reflect the more detailed pattern usage for better clarity and future-proofing.
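As a rough illustration of how a pattern list could drive patch filtering, the sketch below drops per-file sections from a git-style diff when the file path matches an exclusion pattern. The function filter_patch and its signature are hypothetical and are not the actual clean_patch or exclude_files in solver/workflow/patch.py; the pattern list reuses the name test_regexp_patterns from the issue but with separator-agnostic patterns, which is also an assumption.

```python
import re
from typing import Iterable, List, Pattern

# Hypothetical configuration point; the name follows the issue's example.
test_regexp_patterns: List[Pattern[str]] = [
    re.compile(r"(^|[\\/])(testing|tests|test)[\\/]"),
    re.compile(r"(^|[\\/])test_[^\\/]*\.py$"),
    re.compile(r"(^|[\\/])[^\\/]*_test\.py$"),
]


def filter_patch(patch: str, exclude: Iterable[Pattern[str]]) -> str:
    """Drop each per-file section of a git-style diff whose path matches."""
    exclude = list(exclude)
    kept: List[str] = []
    keep_section = True
    for line in patch.splitlines(keepends=True):
        if line.startswith("diff --git "):
            # Header looks like: diff --git a/<path> b/<path>
            path = line.rstrip("\n").split(" b/", 1)[-1]
            keep_section = not any(p.search(path) for p in exclude)
        if keep_section:
            kept.append(line)
    return "".join(kept)


sample_patch = (
    "diff --git a/src/app.py b/src/app.py\n"
    "--- a/src/app.py\n"
    "+++ b/src/app.py\n"
    "@@ -1 +1 @@\n"
    "-x = 1\n"
    "+x = 2\n"
    "diff --git a/tests/test_app.py b/tests/test_app.py\n"
    "--- a/tests/test_app.py\n"
    "+++ b/tests/test_app.py\n"
    "@@ -1 +1 @@\n"
    "-assert x == 1\n"
    "+assert x == 2\n"
)

filtered = filter_patch(sample_patch, test_regexp_patterns)
print(filtered)
```

Passing the pattern list as an argument keeps the patterns in one place, so a caller could swap in a different list to keep only test files instead of excluding them.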

With these changes, file inclusion and exclusion become more accurate: valid files are no longer omitted by accident, and test files are still targeted reliably.