tawada / grass-grower

0 stars 0 forks source link

Enhance error handling in `github_utils.py` by incorporating exit code analysis and structured output parsing for improved robustness and localization support. #76

Open tawada opened 2 months ago

tawada commented 2 months ago

The provided codebase for automating issue handling on GitHub is well-structured and covers a wide range of functionalities, including interaction with GitHub issues, automated code generation, logging, and robust argument parsing. However, upon a detailed review, one area that stands out for improvement concerns error handling, specifically in the context of the services/github/github_utils.py module.

In the exec_command function within the github_utils.py module, exceptions are caught, and specific exception classes are raised based on the error message contained within the subprocess's CalledProcessError. This mechanism relies heavily on the presence of certain keywords in the error message to map to predefined exceptions. This approach, while functional, has a couple of drawbacks:

  1. Brittleness to Error Message Changes: The current design assumes that certain keywords will always be present in error messages for specific types of failures. However, error messages could change over time due to updates in the tools being invoked (git, the shell, etc.). This could lead to the incorrect mapping of errors or potentially missing the detection of some errors entirely.

  2. Limited Localization Support: Relying on the content of error messages makes this tool less effective in environments where tools might not output error messages in English. This limits the internationalization potential of the tool.

To address these concerns, a more robust error handling mechanism could be implemented. Instead of relying solely on the content of error messages, consider augmenting the system with the following strategies:

Implementing these suggestions will lead to more resilient and maintainable error handling, ultimately enhancing the robustness of the tool in handling a wider variety of error conditions and operational environments.

tawada commented 2 months ago

Please be specific, showing the code before the change and the code after the proposed change

tawada commented 2 months ago

To enhance the error handling mechanisms in the services/github/github_utils.py module for improved robustness and localization support, the following changes are proposed. These changes aim to mitigate the brittleness associated with error message changes and the limited localization support by incorporating exit code analysis and structured error output parsing.

Original exec_command Function:

def exec_command(repo: str,
                 command: list[str],
                 capture_output: bool = False) -> subprocess.CompletedProcess:
    """
    Execute a shell command within the specified git repository path.
    """
    repo_path = os.path.join(DEFAULT_PATH, repo)
    try:
        complete_process = subprocess.run(
            command,
            stdout=subprocess.PIPE if capture_output else None,
            stderr=subprocess.PIPE if capture_output else None,
            cwd=repo_path,
            check=True,
        )
        return complete_process
    except subprocess.CalledProcessError as err:
        log(
            f"Command {command} failed with error ({err.returncode}): {err}",
            level="error",
        )
        exception, error_message = exceptions.parse_exception(err)
        raise exception(error_message) from err

Proposed Changes:

  1. Exit Code Analysis: Analyze the exit codes and map them to predefined exceptions. This reduces the reliance on error message content.

  2. Structured Output Parsing: This example assumes git commands have a structure that can be parsed more reliably than plain text errors (as an illustrative example; actual implementation will vary based on the output provided by the tools in use).

  3. Forcing Locale for Subprocess: Force the locale of the subprocess to ensure error messages are in English, enabling reliable parsing.

import json

def exec_command(repo: str,
                 command: list[str],
                 capture_output: bool = False,
                 expected_exit_codes: list[int] = [0]) -> subprocess.CompletedProcess:
    """
    Execute a shell command within the specified git repository path.
    Force the locale to English for consistent error message parsing.
    """
    command_with_locale = ['env', 'LC_ALL=en_US.UTF-8'] + command  # Force English error messages
    repo_path = os.path.join(DEFAULT_PATH, repo)
    try:
        complete_process = subprocess.run(
            command_with_locale,
            stdout=subprocess.PIPE if capture_output else None,
            stderr=subprocess.PIPE if capture_output else None,
            cwd=repo_path,
            check=False,  # Don't raise error immediately on non-zero exit
        )
        if complete_process.returncode in expected_exit_codes:
            return complete_process
        else:
            # Attempt to parse structured error message if available
            try:
                err_data = json.loads(complete_process.stderr)
                error_code = err_data.get('error_code', 'UNKNOWN_ERROR')
                raise exceptions.ErrorFromCode(error_code)  # Custom function to map error codes to exceptions
            except json.JSONDecodeError:
                # Fallback if output is not JSON-parsable
                raise exceptions.CommandExecutionException(f"Command {command} failed with exit code {complete_process.returncode}")
    except subprocess.CalledProcessError as err:
        log(f"Command {command} failed: {err}", level="error")
        raise  # Re-raise enhanced exception

Key Aspects of the Changes:

Implementing these changes would enhance maintainability and robustness, especially concerning internationalization and evolving tool error messages.