Paper-Chart-Extraction-Project / ChartExtractor

ChartExtractor uses computer vision to convert images of paper charts to digital data.
https://paper-chart-extraction-project.github.io/ChartExtractor/
GNU General Public License v3.0

Feature tiling #2

Closed RyanDoesMath closed 4 months ago

RyanDoesMath commented 4 months ago

Tiling

This pull request adds two new functions to the image processing library.

Description of the Two New Functions

  1. tile_image: Tiles an image into a grid of overlapping sub-images. This helps object detection models that struggle with small objects in a large image: within a tile, a small object takes up a larger share of the model's input, so the model may detect objects it would miss in the full image.
  2. tile_annotations: Complements tile_image. It takes a list of bounding boxes or keypoints (see the BoundingBox and Keypoint classes) along with the image dimensions and tiling parameters, and assigns each annotation to every tile that completely encloses it, so that each tile created from the original image carries its matching annotations.
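As a rough illustration of how an overlapping grid can be laid out, the sketch below computes the pixel box of each tile from the tiling parameters. The name `sketch_tile_boxes`, the rounding of the stride, and the choice to drop tiles that would run past the image edge are assumptions made for this example, not the behavior of tile_image itself.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def sketch_tile_boxes(
    image_width: int,
    image_height: int,
    slice_width: int,
    slice_height: int,
    horizontal_overlap_ratio: float,
    vertical_overlap_ratio: float,
) -> List[List[Box]]:
    """Hypothetical helper: return rows of tile boxes that fit inside the image."""
    # The stride is the tile size minus the overlapping part (at least 1 pixel).
    x_stride = max(1, round(slice_width * (1 - horizontal_overlap_ratio)))
    y_stride = max(1, round(slice_height * (1 - vertical_overlap_ratio)))
    # Assumption: tiles that would extend past the right or bottom edge are dropped.
    lefts = range(0, image_width - slice_width + 1, x_stride)
    tops = range(0, image_height - slice_height + 1, y_stride)
    return [
        [(left, top, left + slice_width, top + slice_height) for left in lefts]
        for top in tops
    ]


# A 512x512 image with 256x256 tiles and 50% overlap gives a 3x3 grid.
grid = sketch_tile_boxes(512, 512, 256, 256, 0.5, 0.5)
print(len(grid), len(grid[0]))  # 3 3
print(grid[0][1])               # (128, 0, 384, 256)
```

Each box uses the (left, upper, right, lower) order that PIL's `Image.crop` accepts, so cropping the image with each box would yield the corresponding sub-image.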

This pull request includes necessary validation checks for input parameters to ensure proper usage.
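The description does not list which checks are performed. The sketch below shows the kind of checks that tiling parameters commonly need; the function name `sketch_validate_tiling_parameters` and every rule in it are assumptions for illustration, not the checks this pull request implements.

```python
def sketch_validate_tiling_parameters(
    image_width: int,
    image_height: int,
    slice_width: int,
    slice_height: int,
    horizontal_overlap_ratio: float,
    vertical_overlap_ratio: float,
) -> None:
    """Hypothetical checks; raises ValueError on the first problem found."""
    if slice_width <= 0 or slice_height <= 0:
        raise ValueError("slice_width and slice_height must be positive.")
    if slice_width > image_width or slice_height > image_height:
        raise ValueError("A slice must not be larger than the image.")
    ratios = {
        "horizontal_overlap_ratio": horizontal_overlap_ratio,
        "vertical_overlap_ratio": vertical_overlap_ratio,
    }
    for name, ratio in ratios.items():
        # A ratio of 1 would make the stride zero, so it is excluded.
        if not 0 <= ratio < 1:
            raise ValueError(f"{name} must be in [0, 1), got {ratio}.")


# Valid parameters pass silently; an overlap ratio of 1.0 is rejected.
sketch_validate_tiling_parameters(1000, 800, 256, 256, 0.1, 0.2)
try:
    sketch_validate_tiling_parameters(1000, 800, 256, 256, 1.0, 0.2)
except ValueError as error:
    print(error)
```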

How to use these functions:

from PIL import Image

# Assuming you have an image loaded as 'image'
# Set tiling parameters (adjust as needed)
slice_width = 256
slice_height = 256
horizontal_overlap_ratio = 0.1  # 10% horizontal overlap
vertical_overlap_ratio = 0.2  # 20% vertical overlap

# Get the tiled sub-images ('tiles' is a list of PIL Image objects)
tiles = tile_image(
    image,
    slice_width,
    slice_height,
    horizontal_overlap_ratio,
    vertical_overlap_ratio,
)

# Assuming you have a list of bounding boxes 'annotations'
tiled_annotations = tile_annotations(
    annotations,
    image.width,
    image.height,
    slice_width,
    slice_height,
    horizontal_overlap_ratio,
    vertical_overlap_ratio,
)
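To make the "completely enclose" rule concrete, here is a minimal sketch of assigning boxes to tiles. It works on plain (left, top, right, bottom) tuples rather than BoundingBox objects, and the names `box_inside` and `sketch_assign_to_tiles` are hypothetical. Whether tile_annotations also shifts coordinates into each tile's frame is not stated here, so the sketch leaves coordinates unchanged.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def box_inside(inner: Box, outer: Box) -> bool:
    """True when 'inner' lies entirely within 'outer' (edges may touch)."""
    return (
        inner[0] >= outer[0]
        and inner[1] >= outer[1]
        and inner[2] <= outer[2]
        and inner[3] <= outer[3]
    )


def sketch_assign_to_tiles(annotations: List[Box], tiles: List[Box]) -> List[List[Box]]:
    """For each tile, collect the annotations that the tile fully contains."""
    return [[a for a in annotations if box_inside(a, tile)] for tile in tiles]


# Two 256-pixel-wide tiles that overlap horizontally by 128 pixels.
example_tiles = [(0, 0, 256, 256), (128, 0, 384, 256)]
example_annotations = [(10, 10, 50, 50), (150, 20, 200, 60), (240, 100, 300, 140)]
assigned = sketch_assign_to_tiles(example_annotations, example_tiles)
print(assigned[0])  # [(10, 10, 50, 50), (150, 20, 200, 60)]
print(assigned[1])  # [(150, 20, 200, 60), (240, 100, 300, 140)]
```

The second annotation sits in the overlap region and so appears in both tiles, while the third crosses the right edge of the first tile and is kept only in the second.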

Annotation Data Classes: BoundingBox and Keypoint

This pull request also introduces two new classes, BoundingBox and Keypoint, to the image processing library. These classes facilitate working with object detection data, specifically for tasks involving bounding boxes and keypoints.

The BoundingBox Class

Represents a bounding box around an object in an image.
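A bounding box class of this kind might look roughly like the sketch below. The class name `SketchBoundingBox`, its fields, and its validation rule are assumptions for illustration and may differ from the BoundingBox class added in this pull request. Coordinates are assumed to be in pixels with the origin at the top-left corner of the image.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchBoundingBox:
    """Hypothetical stand-in for a bounding box annotation."""

    category: str
    left: float
    top: float
    right: float
    bottom: float

    def __post_init__(self) -> None:
        # With a top-left origin, 'top' must be numerically smaller than 'bottom'.
        if self.left >= self.right or self.top >= self.bottom:
            raise ValueError("Expected left < right and top < bottom.")

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.left + self.right) / 2, (self.top + self.bottom) / 2)

    @property
    def box(self) -> Tuple[float, float, float, float]:
        return (self.left, self.top, self.right, self.bottom)


example_box = SketchBoundingBox("systolic", 10, 20, 30, 60)
print(example_box.center)  # (20.0, 40.0)
```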

The Keypoint Class

Represents a keypoint paired with a bounding box, both associated with one object in an image. At the time of writing, keypoints are used only for blood pressure and heart rate on the charts.
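One way to picture the pairing is a point stored alongside the box of the object it belongs to. In the sketch below, the class name `SketchKeypoint`, its fields, and the rule that the point must lie inside its box are all assumptions for illustration, not the definition of the Keypoint class in this pull request.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchKeypoint:
    """Hypothetical stand-in: a point plus the box of the object it marks."""

    category: str
    x: float
    y: float
    box: Tuple[float, float, float, float]  # (left, top, right, bottom)

    def __post_init__(self) -> None:
        left, top, right, bottom = self.box
        # Assumption for this sketch: the keypoint must fall within its box.
        if not (left <= self.x <= right and top <= self.y <= bottom):
            raise ValueError("Keypoint lies outside its bounding box.")


example_keypoint = SketchKeypoint("systolic", 15, 25, (10, 20, 30, 60))
print(example_keypoint.x, example_keypoint.y)  # 15 25
```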

These classes provide a foundation for working with object detection data in various formats. They can parse common object detection formats (YOLO, COCO), validate data integrity, and generate output in the desired format. From now on, any code that uses bounding boxes should create BoundingBox objects to work with, adding a new constructor to BoundingBox when a new source format is needed. This will apply to the outputs of computer vision models once that feature is added.
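The "one constructor per source format" idea can be sketched with alternative constructors written as classmethods. The class `SketchBox` and its method names and signatures below are hypothetical and are not taken from the library. The format conventions themselves are standard: a YOLO label line is `class_id x_center y_center width height` with values normalized to the image size, and a COCO annotation stores `bbox` as `[x, y, width, height]` in pixels measured from the top-left corner.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SketchBox:
    """Hypothetical box with one constructor per annotation format."""

    category: str
    left: float
    top: float
    right: float
    bottom: float

    @classmethod
    def from_yolo(
        cls, line: str, image_width: int, image_height: int, id_to_category: Dict[int, str]
    ) -> "SketchBox":
        class_id, x_c, y_c, w, h = line.split()
        # Scale the normalized values back to pixels.
        x_center, width = float(x_c) * image_width, float(w) * image_width
        y_center, height = float(y_c) * image_height, float(h) * image_height
        return cls(
            id_to_category[int(class_id)],
            x_center - width / 2,
            y_center - height / 2,
            x_center + width / 2,
            y_center + height / 2,
        )

    @classmethod
    def from_coco(cls, annotation: dict, id_to_category: Dict[int, str]) -> "SketchBox":
        x, y, width, height = annotation["bbox"]
        return cls(id_to_category[annotation["category_id"]], x, y, x + width, y + height)

    def to_yolo(self, image_width: int, image_height: int, category_to_id: Dict[str, int]) -> str:
        x_center = (self.left + self.right) / 2 / image_width
        y_center = (self.top + self.bottom) / 2 / image_height
        width = (self.right - self.left) / image_width
        height = (self.bottom - self.top) / image_height
        return f"{category_to_id[self.category]} {x_center} {y_center} {width} {height}"


categories = {0: "systolic"}
from_yolo = SketchBox.from_yolo("0 0.5 0.5 0.25 0.5", 1000, 500, categories)
from_coco = SketchBox.from_coco({"bbox": [375, 125, 250, 250], "category_id": 0}, categories)
print(from_yolo == from_coco)                       # True
print(from_yolo.to_yolo(1000, 500, {"systolic": 0}))  # 0 0.5 0.5 0.25 0.5
```

Keeping format handling in constructors means the rest of the code only ever sees one box representation, whichever format the data arrived in.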