oss-slu / bubble_scan

GNU General Public License v3.0

Created a testing infrastructure for our scantron processing algorithm #88

Closed: ramezmosad closed this 1 month ago

ramezmosad commented 1 month ago

Fixes #80

What was changed?

I have added comprehensive unit and integration tests for the Scantron processing algorithm, specifically targeting the Scantron95945 class. These tests are written using the Pytest framework and cover key functions such as extractImagesFromPdf, align_image, crop_roi, get_responses_bubble_row, bubble_column, student_id, and the overall integration of these components. The new test scripts aim to validate the correctness, reliability, and robustness of the algorithm under various conditions.

Why was it changed?

The change was made to address the need for thorough testing of the Scantron processing algorithm to ensure its reliability and accuracy. The algorithm performs critical tasks like extracting images from PDFs, aligning scantron images, detecting filled bubbles, and generating JSON outputs with student results. By implementing these tests, we aim to catch potential bugs, handle edge cases (such as corrupted PDFs, distorted images, and varying fill levels of bubbles), and ensure that the system can handle real-world scenarios without errors.

How was it changed?

I added several test classes and functions within the tests directory to cover both unit and integration testing:

Unit Tests:

  1. test_align.py:

    • Tests the align_image function for correct image alignment and handling cases where keypoints are missing.
    • Verifies that images are correctly aligned when features are present.
    • Checks that the function returns the original image when keypoints are absent.
  2. test_bubble_detection.py:

    • Tests get_responses_bubble_row and bubble_column functions for accurate bubble detection under various fill levels.
    • Ensures single and multiple filled bubbles are detected correctly.
    • Validates behavior when no bubbles are filled.
  3. test_crop_ROI.py:

    • Tests the crop_roi function to ensure ROIs are correctly cropped and extracted.
    • Validates proper cropping when all markers are present.
    • Uses synthetic images with markers to simulate different scenarios.
  4. test_id.py:

    • Tests the student_id function for accurate extraction of student IDs.
    • Checks for complete IDs with all digits filled.
    • Tests behavior with incomplete IDs, ensuring missing digits are represented as 'X'.
  5. test_extract.py:

    • Tests extractImagesFromPdf for correct image extraction from PDFs.
    • Mocks PDF files to simulate extraction without relying on actual files.
    • Validates that images are saved with correct dimensions and naming conventions.
  6. test_error_handling.py:

    • Tests how the system handles errors such as corrupted PDFs or missing image files.
    • Ensures the system doesn't crash and logs appropriate error messages.
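
As a rough illustration of the bubble-detection and student-ID cases listed above, here is a minimal pytest-style sketch. The functions `filled_choices`, `decode_student_id`, and `one_hot_column`, the 0.5 threshold, and the use of fill ratios as input are all hypothetical stand-ins; the real tests exercise Scantron95945 on images, and its signatures are not shown in this description.

```python
# Hypothetical stand-ins for illustration only; not the Scantron95945 API.
FILL_THRESHOLD = 0.5  # assumed fraction of dark pixels that counts as "filled"


def filled_choices(fill_ratios, threshold=FILL_THRESHOLD):
    """Return the letters of the bubbles whose fill ratio meets the threshold."""
    letters = "ABCDE"
    return [letters[i] for i, ratio in enumerate(fill_ratios) if ratio >= threshold]


def decode_student_id(digit_columns, threshold=FILL_THRESHOLD):
    """Decode one digit per column; a column without exactly one mark becomes 'X'."""
    digits = []
    for column in digit_columns:
        marked = [i for i, ratio in enumerate(column) if ratio >= threshold]
        digits.append(str(marked[0]) if len(marked) == 1 else "X")
    return "".join(digits)


def one_hot_column(digit):
    """Build a 10-bubble column in which only the bubble for `digit` is filled."""
    return [0.9 if i == digit else 0.0 for i in range(10)]


def test_single_filled_bubble():
    assert filled_choices([0.05, 0.9, 0.1, 0.0, 0.02]) == ["B"]


def test_multiple_filled_bubbles():
    assert filled_choices([0.8, 0.1, 0.75, 0.0, 0.0]) == ["A", "C"]


def test_no_filled_bubbles():
    assert filled_choices([0.1, 0.2, 0.0, 0.05, 0.3]) == []


def test_complete_id():
    assert decode_student_id([one_hot_column(d) for d in (4, 0, 2)]) == "402"


def test_incomplete_id_uses_x():
    blank_column = [0.0] * 10
    columns = [one_hot_column(1), blank_column, one_hot_column(7)]
    assert decode_student_id(columns) == "1X7"
```

pytest collects the `test_*` functions by name, so a sketch like this needs no imports beyond the code under test.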

Integration Test:

  1. test_workflow.py:
    • Simulates the full workflow from PDF upload to JSON generation.
    • Mocks methods interacting with external systems or files.
    • Verifies that the system correctly processes PDFs and outputs the expected JSON structure.
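
The mocking approach described for the integration test can be sketched roughly as follows. `run_workflow`, its two injected callables, the file names, and the JSON keys (`students`, `studentID`, `answers`) are assumptions made for this example; the actual test patches methods on Scantron95945 and checks the project's real output format.

```python
import json
from unittest.mock import Mock


def run_workflow(pdf_path, extract_images, grade_image):
    """Hypothetical pipeline: PDF -> page images -> per-student results -> JSON string."""
    students = [grade_image(image) for image in extract_images(pdf_path)]
    return json.dumps({"students": students})


def test_workflow_outputs_expected_json():
    # Replace the file-touching steps with mocks so no real PDF or image is needed.
    extract_images = Mock(return_value=["page_1.jpg", "page_2.jpg"])
    grade_image = Mock(side_effect=[
        {"studentID": "1234567890", "answers": {"Q1": "A"}},
        {"studentID": "0987654321", "answers": {"Q1": "C"}},
    ])

    result = json.loads(run_workflow("exam.pdf", extract_images, grade_image))

    # The extractor is called once with the PDF path; each page is graded once.
    extract_images.assert_called_once_with("exam.pdf")
    assert grade_image.call_count == 2
    assert [s["studentID"] for s in result["students"]] == ["1234567890", "0987654321"]
    assert result["students"][1]["answers"] == {"Q1": "C"}
```

Passing the collaborators in as arguments keeps the sketch short; with methods on a class, `unittest.mock.patch.object` would serve the same purpose.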

Helper Class: Scantron95945TestHelper, a subclass of Scantron95945 that adjusts certain methods for testing purposes, such as lowering thresholds in get_responses_bubble_row.
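
A minimal sketch of the subclass-for-testing idea, assuming the threshold is a class attribute. `Scantron95945Stub`, `Scantron95945TestHelperSketch`, `fill_threshold`, and the fill-ratio input are all invented for this example; the real helper may lower thresholds by overriding the method instead.

```python
class Scantron95945Stub:
    """Stand-in for Scantron95945; the real class and its signatures are not shown here."""

    fill_threshold = 0.6  # assumed production threshold

    def get_responses_bubble_row(self, fill_ratios):
        """Return the indices of bubbles whose fill ratio meets the threshold."""
        return [i for i, ratio in enumerate(fill_ratios) if ratio >= self.fill_threshold]


class Scantron95945TestHelperSketch(Scantron95945Stub):
    """Test-only subclass that lowers the threshold so faint synthetic marks register."""

    fill_threshold = 0.3


def test_helper_detects_faint_marks():
    faint_row = [0.0, 0.4, 0.0, 0.0, 0.0]
    # The stub with the assumed production threshold ignores the faint mark...
    assert Scantron95945Stub().get_responses_bubble_row(faint_row) == []
    # ...while the test helper, with its lower threshold, detects it.
    assert Scantron95945TestHelperSketch().get_responses_bubble_row(faint_row) == [1]
```

One trade-off of this design is that tests run against adjusted thresholds, so at least some cases should still exercise the unmodified class.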

Other Comments: I've also manually tested our program with scantron sheets scanned at two resolutions: 200 dpi and 600 dpi. Accuracy was 100% at both.

All of the unit tests I wrote pass:

[Screenshot 2024-09-29 at 11 50 11 PM]

kate-holdener commented 1 month ago

@ramezmosad this is an impressive test suite! When I run the tests, they are not finding the PDF directory. Can you share how you run these tests (which directory you run them from and which environment variables you set)?