sassoftware / python-dlpy

The SAS Deep Learning Python (DLPy) package provides the high-level Python APIs to deep learning methods in SAS Visual Data Mining and Machine Learning. It allows users to build deep learning models using friendly Keras-like APIs.
Apache License 2.0

enabled bounding box filtering when creating object detection table #358

Closed dxq77dxq closed 3 years ago

dxq77dxq commented 3 years ago

Added three more options, check_bbox, min_bbox_width and min_bbox_height, to help filter bounding boxes. If a box's width or height is below the corresponding threshold, that bounding box won't be included in the data.
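To illustrate the filtering rule described above, here is a minimal sketch. It is not the DLPy implementation: the function name `filter_small_boxes` and the tuple layout `(label, x_min, y_min, x_max, y_max)` are assumptions made for this example; only the option names check_bbox, min_bbox_width and min_bbox_height come from the PR description.

```python
def filter_small_boxes(boxes, min_bbox_width=0.0, min_bbox_height=0.0, check_bbox=True):
    """Drop boxes narrower or shorter than the given thresholds.

    boxes: list of (label, x_min, y_min, x_max, y_max) tuples in pixels.
    This layout is an assumption for the sketch, not DLPy's table schema.
    """
    if not check_bbox:
        # Filtering disabled: return every box unchanged.
        return list(boxes)
    kept = []
    for label, x_min, y_min, x_max, y_max in boxes:
        width = x_max - x_min
        height = y_max - y_min
        # A box is kept only if both dimensions reach their thresholds.
        if width >= min_bbox_width and height >= min_bbox_height:
            kept.append((label, x_min, y_min, x_max, y_max))
    return kept


boxes = [("person", 10, 10, 110, 210), ("dog", 50, 50, 53, 90)]
filtered = filter_small_boxes(boxes, min_bbox_width=5, min_bbox_height=5)
```

Here the "dog" box is only 3 pixels wide, so it is dropped while the "person" box is kept.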

michaelgorkow commented 3 years ago

First of all: Good that this will finally be implemented. :-) Nevertheless, I don't understand why you didn't simply merge my PR, as it had more functionality (it allowed dropping bboxes outside of the image) and also included a test case...

Second: Not sure about your experience, but I usually have label data in CSV files where each line describes a bbox. To my mind, the _create_object_detection_table_noxml function is much more useful.

Third: I would recommend having a separate check_bbox function that both _create_object_detectiontable functions can call.

dxq77dxq commented 3 years ago

Thanks for the initial implementation of this new feature. While your code offered more flexibility in how bounding boxes can be dropped, there are several potential issues:

  1. Based on our experience, most internal and external users follow our example and use the create_object_detection_table() function instead of create_object_detection_table_no_xml(), so we decided to put this feature there;
  2. Your initial implementation only works for the "yolo" format, while we want it to cover both the "yolo" and "coco" formats;
  3. Dropping bounding boxes based on [x1, y1, x2, y2] sometimes causes issues. Say there is a person near the boundary of the image: the x1 value could be small, but if the bounding box itself is large enough, we still want to keep it;
  4. We requested some changes to your initial pull request to correct typos and cover more cases. We weren't able to merge your pull request without those issues being properly addressed.

Thank you again for contributing to DLPy. Please let me know if you have some other suggestions.

michaelgorkow commented 3 years ago

  1. That’s only due to poor documentation. Both functions are still undocumented on Read the Docs.
  2. Agreed, even though yolo is the most widely used. It would be best to support all formats.
  3. I’m not really sure you understood the idea behind it. It was meant to drop bounding boxes that are partly outside the image and therefore no longer contain the object. Say you want to detect a person, but only their feet are in the image.
  4. I haven’t seen any requests for changes.
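
One way to make the "partly outside the image" idea from point 3 concrete is to measure how much of each annotated box actually lies inside the image. The sketch below is a hypothetical illustration, not code from either pull request: the names `visible_fraction` and `keep_mostly_visible`, the `min_visible` threshold, and the `(x_min, y_min, x_max, y_max)` pixel layout are all assumptions, and it only helps when the annotation records the object's full extent, including the part beyond the image border.

```python
def visible_fraction(box, image_width, image_height):
    """Return the share of the box area that lies inside the image (0.0 to 1.0)."""
    x_min, y_min, x_max, y_max = box
    full_area = (x_max - x_min) * (y_max - y_min)
    if full_area <= 0:
        # Degenerate or inverted box: treat it as not visible.
        return 0.0
    # Clip the box to the image rectangle [0, width] x [0, height].
    clipped_w = max(0.0, min(x_max, image_width) - max(x_min, 0.0))
    clipped_h = max(0.0, min(y_max, image_height) - max(y_min, 0.0))
    return (clipped_w * clipped_h) / full_area


def keep_mostly_visible(boxes, image_width, image_height, min_visible=0.5):
    """Keep boxes whose visible share is at least min_visible."""
    return [b for b in boxes
            if visible_fraction(b, image_width, image_height) >= min_visible]


boxes = [(10, 10, 60, 60), (-50, 0, 50, 100), (-90, 0, 10, 100)]
kept = keep_mostly_visible(boxes, image_width=200, image_height=200)
```

A large box that merely touches the border keeps a high visible fraction, so it survives the filter, while a box that is mostly cut off does not.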

dxq77dxq commented 3 years ago

My apologies if I didn't fully understand your idea. To my knowledge, we adjust the dimensions of the bounding boxes when the original images are resized, so if a bounding box exists, its values remain valid unless the images are cropped. It's not easy to determine from x_min or y_min alone whether the box contains most of the person or just a foot. This part of the work should be done when the bounding boxes are created with annotation tools.
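
The point about resizing can be sketched as follows. This is a generic illustration of rescaling box coordinates together with an image, not DLPy's actual code; the function name `scale_box` and the `(x_min, y_min, x_max, y_max)` pixel layout are assumptions.

```python
def scale_box(box, original_size, new_size):
    """Rescale a pixel-coordinate box when the image is resized.

    original_size and new_size are (width, height) tuples.
    """
    x_min, y_min, x_max, y_max = box
    orig_w, orig_h = original_size
    new_w, new_h = new_size
    # Each axis is scaled independently, so the aspect ratio may change.
    sx = new_w / orig_w
    sy = new_h / orig_h
    return (x_min * sx, y_min * sy, x_max * sx, y_max * sy)


# Halving an 800x600 image halves every coordinate of the box.
scaled = scale_box((100, 50, 300, 250), (800, 600), (400, 300))
```

Because every coordinate is multiplied by the same per-axis factor, a box that was inside the original image stays inside the resized one. Cropping, by contrast, can leave a box partly outside the new frame.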