Koldim2001 / YOLO-Patch-Based-Inference

Python library for YOLO small object detection and instance segmentation
GNU Affero General Public License v3.0

Issue related to extraction of segmentation coordinates #22

Closed Saanidhyavats closed 1 month ago

Saanidhyavats commented 1 month ago

I am using a YOLO segmentation model and can visualize the segmentation results. However, I am unable to extract the segmentation coordinates, although I can successfully extract the detection coordinates. Could you help me resolve this issue?

Koldim2001 commented 1 month ago

Good day. Could you share the code you use to run the patch inference? I would be glad to help you.

Saanidhyavats commented 1 month ago

element_crops = MakeCropsDetectThem(
    image=small_img,
    model_path=weights,
    segment=True,
    show_crops=False,
    shape_x=512,
    shape_y=512,
    overlap_x=overlap_x,
    overlap_y=overlap_y,
    conf=conf,
    iou=iou,
    classes_list=[0],
    resize_initial_size=True,
    memory_optimize=False,
)

result = CombineDetections(element_crops, nms_threshold=nms_threshold)

coordinates = result.filtered_boxes  # coordinates of detected regions

visualize_results(
    img=img,
    confidences=result.filtered_confidences,
    boxes=coordinates,
    masks=result.filtered_masks,
    classes_ids=result.filtered_classes_id,
    classes_names=result.filtered_classes_names,
    thickness=20,
    show_boxes=True,
    fill_mask=True,
    delta_colors=3,
    show_class=True,
    axis_off=False,
    return_image_array=True,
)

Saanidhyavats commented 1 month ago

I tried using polygons = result.filtered_polygons to extract the coordinates of the segmented regions, but it returns an empty list.

Koldim2001 commented 1 month ago

Is result.filtered_boxes an empty array? If yes, it means that no objects were found, and you should experiment with the thresholds for your network + try other patch configurations. To make it easier, I suggest looking at what the patches look like by setting show_crops=True.

If result.filtered_boxes is not empty, then you should check the result.filtered_polygons object, as it contains the polygons obtained as a result of instance segmentation.

If you want to display them, you can do it like this:

visualize_results(
    img=result.image,
    confidences=result.filtered_confidences,
    boxes=result.filtered_boxes,
    polygons=result.filtered_polygons,
    classes_ids=result.filtered_classes_id,
    classes_names=result.filtered_classes_names,
    segment=True,
    fill_mask=True,
    show_boxes=False,
    show_class=False,
)

If you want to get the binary mask arrays from the polygons, you can do it like this:

from patched_yolo_infer import create_masks_from_polygons

masks = create_masks_from_polygons(polygons=result.filtered_polygons, image=result.image)
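As an aside, once you have binary masks (for example from create_masks_from_polygons), extracting the pixel coordinates of each segmented region is straightforward with plain NumPy. This is a minimal sketch assuming each mask is a 2-D array with nonzero values marking the object; the sample mask here is synthetic, not library output:

```python
import numpy as np

# Hypothetical binary mask standing in for one element of `masks`:
# a 2-D array where nonzero pixels belong to the object.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[2:4, 1:5] = 1  # a 2x4 rectangular object

# np.argwhere returns the (row, col) index of every nonzero pixel,
# i.e. the (y, x) coordinates of the segmented region.
ys_xs = np.argwhere(mask)
xs = ys_xs[:, 1]
ys = ys_xs[:, 0]
print(len(ys_xs))  # 8 object pixels (2 rows x 4 columns)
print(xs.min(), xs.max(), ys.min(), ys.max())  # tight bounding box: 1 4 2 3
```

Note that np.argwhere gives (row, col) order, so the column index is x and the row index is y.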
Saanidhyavats commented 1 month ago

result.filtered_boxes has values, and I can see the instance segmentation results on the image using visualize_results, but result.filtered_polygons gives me an empty array for some reason (even though the segmentation regions are visible).

Koldim2001 commented 1 month ago

@Saanidhyavats I understood the reason. If memory_optimize=False is selected, the segmentation result is in result.filtered_masks, not in result.filtered_polygons.

Koldim2001 commented 1 month ago

The difference in how the function is used lies in the memory_optimize=False parameter of the MakeCropsDetectThem class. In that case, the informative attributes after processing are the following:

img: This attribute contains the original image on which the inference was performed. It provides context for the detected objects.

confidences: This attribute holds the confidence scores associated with each detected object. These scores indicate the model's confidence level in the accuracy of its predictions.

boxes: These bounding boxes are represented as a list of lists, where each list contains four values: [x_min, y_min, x_max, y_max]. These values correspond to the coordinates of the top-left and bottom-right corners of each bounding box.

masks: This attribute provides segmentation binary masks corresponding to the detected objects. These masks can be used to precisely delineate object boundaries.

classes_ids: This attribute contains the class IDs assigned to each detected object. These IDs correspond to specific object classes defined during the model training phase.

classes_names: These are the human-readable names corresponding to the class IDs. They provide semantic labels for the detected objects, making the results easier to interpret.

Here's how you can obtain them:

img=result.image
confidences=result.filtered_confidences
boxes=result.filtered_boxes
masks=result.filtered_masks
classes_ids=result.filtered_classes_id
classes_names=result.filtered_classes_names
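Since the segmentation lives in result.filtered_masks in this mode, boundary coordinates can be recovered from each binary mask directly. Below is a hedged sketch with a hypothetical helper (mask_to_boundary_coords is not part of the library): a pixel is treated as a boundary pixel if it belongs to the object and at least one of its 4-neighbours does not. The mask here is synthetic:

```python
import numpy as np

def mask_to_boundary_coords(mask):
    """Return (x, y) coordinates of the boundary pixels of a binary mask.

    A pixel counts as boundary if it is part of the object and at least
    one of its 4-neighbours is background (or lies outside the image).
    """
    m = mask.astype(bool)
    # Pad with background so pixels on the image edge are handled too.
    p = np.pad(m, 1, constant_values=False)
    # A pixel is interior only if all four neighbours are set.
    interior = p[1:-1, :-2] & p[1:-1, 2:] & p[:-2, 1:-1] & p[2:, 1:-1]
    boundary = m & ~interior
    ys, xs = np.nonzero(boundary)
    return list(zip(xs.tolist(), ys.tolist()))

# Hypothetical 5x5 mask with a solid 3x3 square: only the centre pixel
# is interior, so 8 of the 9 object pixels lie on the boundary.
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1
coords = mask_to_boundary_coords(mask)
print(len(coords))  # 8
```

For production use, OpenCV's cv2.findContours gives ordered contours rather than an unordered pixel set, which is usually what polygon-style coordinates require.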
Koldim2001 commented 1 month ago

@Saanidhyavats If you set memory_optimize=True, the calculation speed will be higher and the accuracy slightly lower. In that case, the mask information will be encoded as polygons instead.
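For readers who end up with polygons, a quick sketch of working with them. This assumes each entry of filtered_polygons is an (N, 2) array of (x, y) vertices (an assumption for illustration; the sample polygon below is made up, not library output):

```python
import numpy as np

# Hypothetical polygon in the assumed filtered_polygons format:
# an (N, 2) array of (x, y) vertices for one detected object.
polygon = np.array([[0, 0], [4, 0], [4, 3], [0, 3]])  # a 4x3 rectangle

# Shoelace formula: area of the polygon from its vertex coordinates.
x, y = polygon[:, 0], polygon[:, 1]
area = 0.5 * abs(float(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))))
print(area)  # 12.0

# Axis-aligned bounding box in [x_min, y_min, x_max, y_max] form,
# matching the box format described above.
bbox = [int(x.min()), int(y.min()), int(x.max()), int(y.max())]
print(bbox)  # [0, 0, 4, 3]
```

The same per-vertex arrays can be fed back to create_masks_from_polygons when binary masks are needed after all.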

Saanidhyavats commented 1 month ago

@Koldim2001 Thank you for helping me with this valuable information. Would it be possible to document it somewhere so that future users know about it? I had assumed memory_optimize was only an argument for trading off speed and accuracy.

Koldim2001 commented 1 month ago

Yes, this information is in the README: detailed documentation on working with the memory-optimize mode is presented there. I also advise looking at the Google Colab notebooks linked in the README as documentation; they contain many different examples and analyzed cases.

Koldim2001 commented 1 month ago

See the section of the readme titled "How to improve the quality of the algorithm for the task of instance segmentation"