mlouage / AdventOfCode.2024

Advent of Code for 2024 - This time in Go
0 stars 0 forks source link

Day 04 Refactor #8

Open mlouage opened 6 hours ago

mlouage commented 6 hours ago

I'll review your Go code and provide feedback to help you write more idiomatic Go. Here are several points to consider:

  1. Error Handling:
    • In Go, it's common to handle errors immediately rather than letting them propagate. In your createPuzzle function, you have multiple error checks that could be combined:
func createPuzzle(ctx context.Context, filename string) (*PuzzleProcessor, error) {
    numLines, err := countLines(filename)
    if err != nil {
        return nil, fmt.Errorf("counting lines: %w", err)
    }
    // This opens the file again, which could be avoided
    numChar, err := countCharacters(filename)
    if err != nil {
        return nil, fmt.Errorf("counting characters: %w", err)
    }

Could be refactored to read the file once and get both counts, following the principle of "don't repeat yourself" (DRY).

  1. Struct Methods:

    • The method hasXmas is quite long and could benefit from being broken down into smaller, more focused methods
    • The naming convention in Go for boolean methods typically starts with "Is", "Has", "Can", etc. Your isValid and isValidX follow this convention well!
  2. Constructor Pattern:

    • Your NewPuzzleProcessor follows the Go convention of using New prefix for constructors, which is good
    • However, you could make it more robust by validating the input parameters:
func NewPuzzleProcessor(lines, chars int) (*PuzzleProcessor, error) {
    if lines <= 0 || chars <= 0 {
        return nil, fmt.Errorf("invalid dimensions: lines=%d, chars=%d", lines, chars)
    }
    // ... rest of the code
}
  1. Boolean Expression Simplification:
    • In isValid, you could simplify the return statement:
func (pp *PuzzleProcessor) isValid(row int, col int) bool {
    return row >= 0 && row <= pp.rows-1 && col >= 0 && col <= pp.cols-1
}
  1. Parameter Grouping:
    • For methods that take multiple parameters of the same type, you can group them:
func (pp *PuzzleProcessor) isValid(row, col int) bool
  1. Long Boolean Expressions:
    • In hasMasX, the boolean expression is quite long and hard to read. Consider breaking it down into smaller, more readable parts:
func (pp *PuzzleProcessor) hasMasX(row, col int) bool {
    if !pp.isValidX(row, col) {
        return false
    }

    type pattern struct {
        m1, m2 position
        s1, s2 position
    }

    patterns := []pattern{
        {{1, 1}, {-1, 1}, {-1, -1}, {1, -1}},
        // ... other patterns
    }

    for _, p := range patterns {
        if pp.data[row+p.m1.row][col+p.m1.col] == "M" &&
           pp.data[row+p.m2.row][col+p.m2.col] == "M" &&
           pp.data[row+p.s1.row][col+p.s1.col] == "S" &&
           pp.data[row+p.s2.row][col+p.s2.col] == "S" {
            return true
        }
    }
    return false
}
  1. File Operations:

    • You're correctly using defer file.Close() after opening files
    • Consider using os.ReadFile for simple file reading operations instead of manually managing the file handle
  2. Context Usage:

    • Good use of context handling in the file processing loop
    • Consider adding context timeout handling in your tests
  3. Constants:

    • Consider defining constants for repeated values like "X", "M", "A", "S"
    • This makes the code more maintainable and less prone to typos
  4. Documentation:

    • Consider adding package documentation and godoc-style comments for exported functions and types
    • Example:
// PuzzleProcessor handles the processing of XMAS puzzle data.
// It maintains a grid of characters and provides methods for analyzing patterns.
type PuzzleProcessor struct {
    // ...
}

Overall, your code shows good understanding of Go concepts, but could benefit from some idiomatic Go patterns and simplifications. The error handling is generally good, and you're following many Go conventions already. The main areas for improvement are around code organization, simplification of complex boolean expressions, and potential performance optimizations around file operations.

Would you like me to elaborate on any of these points or provide more specific examples for any particular aspect?

mlouage commented 6 hours ago

I'll show you how to refactor the file operations to be more efficient by reading the file only once. Here's how we can modify the code:

func createPuzzle(ctx context.Context, filename string) (*PuzzleProcessor, error) {
    // Read file once
    file, err := os.Open(filename)
    if err != nil {
        return nil, fmt.Errorf("opening file: %w", err)
    }
    defer file.Close()

    // First pass to count lines and get line length
    var numLines int
    var numChar int
    scanner := bufio.NewScanner(file)

    // Get first line to determine width
    if scanner.Scan() {
        firstLine := scanner.Text()
        numChar = len(firstLine)
        numLines = 1 // Count the first line
    } else {
        if err := scanner.Err(); err != nil {
            return nil, fmt.Errorf("reading first line: %w", err)
        }
        return nil, fmt.Errorf("file is empty")
    }

    // Count remaining lines
    for scanner.Scan() {
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        default:
            numLines++
            // Optional: Verify all lines have same length
            if len(scanner.Text()) != numChar {
                return nil, fmt.Errorf("inconsistent line length at line %d", numLines)
            }
        }
    }
    if err := scanner.Err(); err != nil {
        return nil, fmt.Errorf("scanning file: %w", err)
    }

    // Create processor with known dimensions
    processor := NewPuzzleProcessor(numLines, numChar)

    // Reset file for second pass
    if _, err := file.Seek(0, 0); err != nil {
        return nil, fmt.Errorf("resetting file position: %w", err)
    }

    // Second pass to populate the data
    scanner = bufio.NewScanner(file)
    i := 0
    for scanner.Scan() {
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        default:
            line := scanner.Text()
            for j, value := range line {
                processor.data[i][j] = string(value)
            }
            i++
        }
    }
    if err := scanner.Err(); err != nil {
        return nil, fmt.Errorf("scanning file: %w", err)
    }

    return processor, nil
}

Key improvements in this version:

  1. We only open the file once and use defer file.Close() to ensure it's properly closed
  2. We use a single scanner to both count lines and get the character count
  3. We verify line length consistency (optional but good for data validation)
  4. We use file.Seek(0, 0) to reset the file position instead of reopening the file
  5. Error messages are more specific about where the error occurred
  6. Context cancellation is still properly handled

This removes the need for separate countLines and countCharacters functions. However, if those functions are used elsewhere in your codebase, you might want to keep them but modify createPuzzle to use this more efficient approach.

You could also make this even more memory efficient by using a single pass if you know the maximum possible dimensions in advance:

func createPuzzle(ctx context.Context, filename string) (*PuzzleProcessor, error) {
    file, err := os.Open(filename)
    if err != nil {
        return nil, fmt.Errorf("opening file: %w", err)
    }
    defer file.Close()

    // Create a scanner with a larger buffer if needed
    scanner := bufio.NewScanner(file)

    // Read first line to get width
    if !scanner.Scan() {
        if err := scanner.Err(); err != nil {
            return nil, fmt.Errorf("reading first line: %w", err)
        }
        return nil, fmt.Errorf("file is empty")
    }

    firstLine := scanner.Text()
    numChar := len(firstLine)

    // Pre-allocate with some reasonable capacity
    rows := make([][]string, 0, 1000) // Adjust capacity based on expected size

    // Process first line
    rows = append(rows, make([]string, numChar))
    for j, value := range firstLine {
        rows[0][j] = string(value)
    }

    // Process remaining lines
    for scanner.Scan() {
        select {
        case <-ctx.Done():
            return nil, ctx.Err()
        default:
            line := scanner.Text()
            if len(line) != numChar {
                return nil, fmt.Errorf("inconsistent line length at line %d", len(rows)+1)
            }

            newRow := make([]string, numChar)
            for j, value := range line {
                newRow[j] = string(value)
            }
            rows = append(rows, newRow)
        }
    }
    if err := scanner.Err(); err != nil {
        return nil, fmt.Errorf("scanning file: %w", err)
    }

    return &PuzzleProcessor{
        data: rows,
        rows: len(rows),
        cols: numChar,
    }, nil
}

This single-pass version:

  1. Only reads the file once
  2. Dynamically grows the slice as needed
  3. Still maintains all error checking and context cancellation
  4. Might use less memory since we don't need to pre-allocate the full size

Choose the version that best fits your needs based on whether you know the maximum size in advance and your performance requirements.