Simibrum / code-assistant

A tool to help you code with ChatGPT.
MIT License

Run Coverage Programmatically to Generate Tests for Untested Code #18

Open benhoyle opened 10 months ago

benhoyle commented 10 months ago

Is your feature request related to a problem? Please describe. The code base should have 100% test coverage. We can use the Python coverage tool to check this. After code and tests have been autogenerated, we want to confirm they cover the whole code base and generate new tests for any missing lines.

Desired Output A method that runs pytest with coverage and generates tests for any uncovered lines.

Code to Edit None so far

Code to Add New methods to:

Useful Existing Code The generate_tests_from_db method - we can write similar functions to this (or re-use the existing functions for test generation and saving).


benhoyle commented 10 months ago

We can install and run coverage, and create a JSON file called coverage.json in the project root:

import subprocess

def run_pytest_coverage():
    # Run pytest with coverage
    pytest_command = ["coverage", "run", "-m", "pytest"]
    subprocess.run(pytest_command, check=True)

    # Generate JSON coverage report
    coverage_json_command = ["coverage", "json"]
    subprocess.run(coverage_json_command, check=True)

if __name__ == "__main__":
    run_pytest_coverage()
benhoyle commented 10 months ago

This gives us an output like this: (screenshot)
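The generated coverage.json can then be read back programmatically. A minimal sketch, assuming the standard coverage.py JSON report layout (a "files" section holding "executed_lines" and "missing_lines" per file); load_missing_lines is a hypothetical helper, not existing project code:

```python
import json


def load_missing_lines(report_path="coverage.json"):
    """Return a mapping of file path -> list of untested line numbers,
    read from a coverage.py JSON report. Files with full coverage are
    dropped, since they need no new tests."""
    with open(report_path) as fh:
        report = json.load(fh)
    return {
        path: data["missing_lines"]
        for path, data in report["files"].items()
        if data["missing_lines"]
    }
```

The resulting mapping gives us exactly the per-file line numbers we need to target when generating follow-up tests.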

benhoyle commented 10 months ago

We can store the line numbers when extracting functions and then cross-reference these:

We can adapt this suggested ChatGPT solution:

import ast
import coverage

def get_function_ranges(file_path):
    with open(file_path, "r") as file:
        tree = ast.parse(file.read(), filename=file_path)

    function_ranges = {}
    for node in ast.walk(tree):
        # Include async functions too, which plain ast.FunctionDef misses
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            function_ranges[node.name] = (node.lineno, node.end_lineno)

    return function_ranges

def function_coverage(function_ranges, executed_lines, missing_lines):
    coverage_map = {}
    for function, (start, end) in function_ranges.items():
        executed = [line for line in range(start, end + 1) if line in executed_lines]
        missing = [line for line in range(start, end + 1) if line in missing_lines]
        coverage_map[function] = {"executed": executed, "missing": missing}

    return coverage_map

# Example usage
file_path = "your_python_file.py"
cov = coverage.Coverage()
cov.load()

# CoverageData has no missing_lines() method; analysis2() returns
# (filename, statements, excluded, missing, missing_formatted), and the
# executed lines are the statements that are not missing
_, statements, _, missing_lines, _ = cov.analysis2(file_path)
executed_lines = sorted(set(statements) - set(missing_lines))

function_ranges = get_function_ranges(file_path)
coverage_by_function = function_coverage(function_ranges, executed_lines, missing_lines)

# Now `coverage_by_function` contains the coverage data for each function
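From that per-function mapping we can pick out the functions that still need test generation. A small sketch (untested_functions is a hypothetical helper operating on the coverage_by_function structure above):

```python
def untested_functions(coverage_by_function):
    """Filter the per-function coverage mapping down to the functions
    that still have untested lines - the candidates for new test
    generation. Returns {function_name: [missing line numbers]}."""
    return {
        name: info["missing"]
        for name, info in coverage_by_function.items()
        if info["missing"]
    }
```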
benhoyle commented 10 months ago

Prompt for revising test:

Here is a function:
{$SELECTION}

Here is its test:
{test_code}

The function runs from line {start_line} to line {end_line}. The following lines are reported as untested by coverage: [{missing_lines}]

Can you amend the test to add assertions that cover the untested lines?
benhoyle commented 10 months ago

General agent process:

benhoyle commented 10 months ago

Need to edit prompts to get multiline strings returned as bracketed strings: (screenshot)