Open kgilpin opened 1 month ago
Accept TestStatus.ERROR for Inverted Test Case if the Marker Error Message __BUG__HERE__ is Observed
The test harness currently discards inverted patches that result in TestStatus.ERROR, even when the specific marker error message __BUG__HERE__ is observed. As a result, potentially valid inverted patches are overlooked.
The test in question deliberately raises an Exception with the marker error message __BUG__HERE__ to indicate the presence of a bug. During execution, the test harness classifies this as an ERROR and discards the inverted patch, since the harness only accepts patches that result in FAILED status. To accept such patches, the system needs to recognize an ERROR status whose output contains the marker error message.
The distinction between FAILED and ERROR:
FAILED: The test case ran to completion and the outcome did not meet expectations.
ERROR: The test case encountered an exception that it could not handle, halting execution.
In this scenario, an ERROR due to the marker __BUG__HERE__ is intentional and should be considered a valid outcome for accepting the inverted patch.
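The FAILED versus ERROR distinction can be reproduced with Python's standard unittest module, which records an unmet assertion as a failure and any other exception as an error. This is only a minimal sketch: the issue does not say which test runner the harness uses, and the names run_examples, InvertedExamples, and MARKER are hypothetical.

```python
import unittest

MARKER = "__BUG__HERE__"


def run_examples() -> unittest.TestResult:
    """Run two marker-carrying tests and return the collected result."""

    # Defined locally so that test collectors do not pick the class up.
    class InvertedExamples(unittest.TestCase):
        def test_reports_failed(self):
            # An unmet assertion raises AssertionError, recorded as a failure.
            self.assertEqual(1 + 1, 3, MARKER)

        def test_reports_error(self):
            # Any other exception is recorded as an error, even if deliberate.
            raise Exception(MARKER)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(InvertedExamples)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_examples()
print(len(result.failures), len(result.errors))  # prints: 1 1
```

Both tests carry the marker in their reported traceback text, but only the assertion-based one lands in the failure list, which matches the behavior described in this issue.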
File: solver/workflow/execute_container.py
Modify parse_test_status to check for the presence of the marker error message __BUG__HERE__ when the status is ERROR.
File: solver/workflow/execute_container.py
Accept TestStatus.ERROR when the marker error message __BUG__HERE__ is observed.
solver/workflow/execute_container.py: Modify parse_test_status function:
Check for __BUG__HERE__ when parsing the test status.
solver/workflow/execute_container.py: Update status evaluation logic
TestStatus.ERROR is accepted if the marker error message is found in the test output.
solver/workflow/execute_container.py: Modify parse_test_status function
from typing import Optional

# TestStatus and MAP_REPO_TO_PARSER are defined elsewhere in the solver.


def parse_test_status(log, repo: str, test_output: Optional[str]) -> TestStatus:
    log_parser = MAP_REPO_TO_PARSER[repo]
    if not log_parser:
        raise ValueError(f"No log parser found for repo {repo}")

    test_status_dict: dict[str, str] = {}
    marker_error_present = False
    if test_output:
        try:
            parsed_status = log_parser(test_output)
            if parsed_status:
                test_status_dict.update(parsed_status)
            # Record whether the deliberate marker appears anywhere in the output.
            if "__BUG__HERE__" in test_output:
                marker_error_present = True
        except Exception as e:
            log("parse-test-status", f"Failed to parse test status: {e}")
            log("parse-test-status", f"Test output: {test_output}")

    if test_status_dict:
        test_status_str = ", ".join(
            f"{test_name}: {status}" for test_name, status in test_status_dict.items()
        )
        log(
            "parse-test-status",
            f"Test status: {test_status_str}",
        )

    def any_status(status_name: str) -> bool:
        return any(
            status for status in test_status_dict.values() if status == status_name
        )

    # Both ERROR branches yield TestStatus.ERROR; the first one identifies the
    # intentional __BUG__HERE__ case. With no parsed statuses, the result is PASSED.
    if any_status(TestStatus.ERROR.value) and marker_error_present:
        test_status = TestStatus.ERROR
    elif any_status(TestStatus.ERROR.value):
        test_status = TestStatus.ERROR
    elif any_status(TestStatus.FAILED.value):
        test_status = TestStatus.FAILED
    else:
        test_status = TestStatus.PASSED

    log("parse-test-status", f"Overall test status: {test_status}")
    return test_status
Modify the existing logic to accept TestStatus.ERROR when __BUG__HERE__ is detected.
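One way the acceptance rule could look is sketched below. This is not the harness's actual code: accept_inverted_patch is a hypothetical helper, the TestStatus enum is a stand-in with assumed string values, and the issue does not say whether FAILED must also carry the marker, so the sketch leaves FAILED accepted as before.

```python
from enum import Enum
from typing import Optional

MARKER = "__BUG__HERE__"


class TestStatus(Enum):
    # Stand-in for the harness enum; the string values are assumed.
    PASSED = "PASSED"
    FAILED = "FAILED"
    ERROR = "ERROR"


def accept_inverted_patch(status: TestStatus, test_output: Optional[str]) -> bool:
    """Hypothetical acceptance rule for an inverted patch."""
    if status == TestStatus.FAILED:
        # Existing behavior described in the issue: FAILED is accepted.
        return True
    if status == TestStatus.ERROR:
        # Proposed behavior: accept ERROR only when the marker is in the output.
        return bool(test_output) and MARKER in test_output
    return False


print(accept_inverted_patch(TestStatus.ERROR, "Exception: __BUG__HERE__"))  # prints: True
print(accept_inverted_patch(TestStatus.ERROR, "ImportError: no module"))    # prints: False
```

Keeping the marker check in the acceptance step, rather than changing what parse_test_status returns, means an unrelated ERROR such as an import failure is still rejected.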
By incorporating these changes, the test harness will be able to accept inverted patches that produce TestStatus.ERROR as long as the marker error message __BUG__HERE__ is observed. This will prevent valid patches from being discarded erroneously.
In the following scenario, the test case is resolved as ERROR by the test harness.
As a result, the inverted patch is discarded, because the current logic only accepts FAILED test status.
Investigate why this test is determined to be an ERROR by the test harness. If ERROR is the correct status, consider accepting inverted patches that have status ERROR and have the marker error __BUG__HERE__.
Test patch