devrimcavusoglu / pybboxes

Light weight toolkit for bounding boxes providing conversion between bounding box types and simple computations.
MIT License
144 stars 14 forks source link

[Feature Request]: Map Location #5

Closed harshraj22 closed 2 years ago

harshraj22 commented 2 years ago

I have an image, and I take a crop from the image. Now I find a bounding box in that crop (top-left of crop is 0,0). I know the position of the crop in original image. Can you add a feature to map this bounding box from crop-relative coordinates to original-image-relative coordintes ?

devrimcavusoglu commented 2 years ago

Hey @harshraj22 thank you for the interest. This would be a good improvement. There are some awaiting structural improvements/changes before that and we can add this to the roadmap, which is sounds like a nice feature. Though, currently I cannot give ETA right away.

harshraj22 commented 2 years ago

Sure @devrimcavusoglu , Let me know if the repo is open for contributions and if there's anyway I could help. I liked the library, and would be happy to improve it further.

devrimcavusoglu commented 2 years ago

@harshraj22 Thank you, contributions and PRs are always welcomed :). Currently I am working on the structural changes particularly, adding classes suport for bounding boxes. As I said, I will take this feature into account right after class implementation is completed (simply because I do not want to introduce breaking changes afterwards) :slightly_smiling_face:.

devrimcavusoglu commented 2 years ago

It is now add to the roadmap 🙂 @harshraj22 are you still interested in contributing ? If so, please review the code base as there are huge structural changes in v0.1.0.

harshraj22 commented 2 years ago

Hi @devrimcavusoglu , yes I am still interested in contributing. Before I start, I had a few questions related to current codebase:

  1. https://github.com/devrimcavusoglu/pybboxes/blob/dd4598cb8feae79962f6092e73e9ccb1082cf841/pybboxes/types/base.py#L105 The base bounding class assumes knowledge about the bounding box types that exist. If new type of bounding boxes are to be added, this class will have to be modified to add new methods. I am not sure why would you add these methods to the base class when all the inhering class have to implement to_voc and from_voc methods. Any conversion across bounding boxes can be carried out using these two methods. Right ?

I understand that you might have made it to make the API more userfriendly (https://github.com/devrimcavusoglu/pybboxes/blob/dd4598cb8feae79962f6092e73e9ccb1082cf841/pybboxes/functional.py#L36),, but the target bounding box class could have been identified using the string, and then used the from_voc method instead ? What's your thought about it ?

eg.

# Note: this is not exact implementation, but just a rough pseudocode.

REGISTERED_BBOXES = {
"voc": VocBoundingBox,
"yolo": YoloBoundingBox,
...
}

def convert_bbox(....):
  return REGISTERED_BBOXES[to_type].from_voc(
    REGISTERED_BBOXES[from_type](...)
  )
devrimcavusoglu commented 2 years ago

Thank you for the reply and the remark @harshraj22. Actually, this was my intention at first.

I understand that you might have made it to make the API more userfriendly

and BoundingBox currently acts like a factory class convert_bbox() is still there but the inner parts have been modified from completely functional factory to the current state utilizing BoundingBox class. Adding to_{box_type} to the base class BoundingBox was to provide more user friendly API where you have any type of bounding box object, you can just convert it by just calling to_{desired_box_type}(), of course this implementation comes with an issue that you stated: if a new bounding box to be added, the base class needs to be modified s.t it has the new bounding box type conversion.

but the target bounding box class could have been identified using the string, and then used the from_voc method instead ? What's your thought about it ?

As I mentioned in the beginning, having conversion methods in the bounding box class such as to_coco, to_yolo is desired as you do not need to have to use convert_bbox at all, it is solely for functional purpose and to assess backward compatibility.

Moreover, we do not need a hardcoded dictionary, in your example, we already have it implemented in load_bbox() factory function which loads the desired bounding box object from the name anyway, see load_bbox. However, I understand what you mean with the pseudo-code, as far as I understand, we do not really need "to_type" conversion, but having the "to_type" load the box in "to_type" and just read it from from_type, this is somewhat identical imo though if you see a great impact on this, then of course we can discuss further. All in all, the idea is without needing convert_bbox the object can be converted by to_{desired_type}, and yes new potential boxes have to be added if the bounding box type is added.

harshraj22 commented 2 years ago

Alright. I understand your arguments. @devrimcavusoglu

So the feature that I requested, would basically need to have shift and scale operations defined. Since each bounding box can have different implementations of these (eg. a box of type x0, y0, x1, y1 would need all 4 points shifted, while the box of type x_center, y_center, height, width will need just two of them shifted for the shift operation), I was thinking to add these as abstractmethods in Box class. Let me know your thoughts about it.

Also, the variable names x_tl, x_br seems confusing to me and what they are expected to represent is not very clear. And do you have plans to push functional methods to a separate directory (backward compatibility can be achieved by importing all methods in __init__.py of the directory, and directory name can be kept as functional) ? Or all future functional methods will be defined in the same file ?

devrimcavusoglu commented 2 years ago

So the feature that I requested, would basically need to have shift and scale operations defined. Since each bounding box can have different implementations of these (eg. a box of type x0, y0, x1, y1 would need all 4 points shifted, while the box of type x_center, y_center, height, width will need just two of them shifted for the shift operation), I was thinking to add these as abstractmethods in Box class. Let me know your thoughts about it.

@harshraj22 So, there is an abstraction of bounding boxes actually, we took as a basis as VOC style (x_tl, y_tl, x_br, y_br) (tl: top-left, br: bottom-right although they need to be documented well in docstrings, I agree with you there I'll adress this documenting issue in an independent PR.). Therefore, we actually use the same underlying computations (VOC style) for any box and give the results in pixel values (unnormalized). Further, it doesn't matter for a box to be COCO or YOLO type for example, because the computations under the hood uses VOC style for any box type by applying conversion ofc. On the other hand, the values associated with the particular box type is preserved under self.values property, they will remain as original values (coco values for coco, yolo values for yolo, etc.), so actually we need to modify the attribute self._values in BaseBoundingBox and not in the Box class. If we do

My thought was to add a method (classmethod) called adjust(new_image_size) to BaseBoundingBox class, one way to achieve such functionality is to (1) convert the values to VOC, (2) adjust the values to the given new size, (3) set values inplace or return values (inferred by inplace).

class BaseBoundingBox(Box, ABC):
     @classmethod
     @abstractmethod
     def adjust(cls, new_image_size: Tuple[int, int]) -> "BaseBoundingBox":
         # this is abstract method and each box style must implement, it can call parent's `adjust()`
         ...
        return cls(...)  # <- return a new instance

something like this, I try to write type hints for guide, but there are many ways to implement it, I think you get my point. However, there is an important thing to remember that new_image_size may be smaller than the current image (for example think that we have bounding box associated with the bigger image, and want the relative box associated with the crop), so this method adjust() must be two way.

In this way, there is no need for inplace option, and if we want to add inplace option support it will be more intricate as we need to modife self.values as well as v1, v2, v3, v4 where this v_i is different for different types, so the way that I wrote above is better suited I assume.

Also, the variable names x_tl, x_br seems confusing to me and what they are expected to represent is not very clear as I mentioned, I'll address this problem in an independent PR, but for now I can give small description for you to work on this issue.

tl: top-left
br: bottom-right
c: center

And do you have plans to push functional methods to a separate directory (backward compatibility can be achieved by importing all methods in init.py of the directory, and directory name can be kept as functional) ? Or all future functional methods will be defined in the same file ?

This will be definetely in consideration as the coverage of functional increases, but there should be an independent PR for this also (in the future).

harshraj22 commented 2 years ago

Oh. I see. @devrimcavusoglu since scale (which requires just modifying coordinates) and shift (which also requires modifying image) are quite different operations, I was thinking of having two different abstractmethods in BaseBoundingBox instead of clubbing them into one adjust method. This would also make sure that we follow separation of concerns and would make it easier to test these properties individually. Let me know your thoughts about it.

As per my understanding of the discussion, following could be a concrete implementation of the shift method for the CocoBoundingBox: Let me know if my understanding is correct


    def shift(self, horizontal_threshold: float, vertical_threshold: float) -> "CocoBoundingBox":
        """Returns a new bounding box shifted by the given thresholds. The new
        bounding box has same image shape, and other properties as the current
        object.

        Parameters
        ----------
        horizontal_threshold : float
            The amount to be shifted in the horizontal axis.
        vertical_threshold : float
            The amount to be shifted in the vertical axis.

        Returns
        -------
        CocoBoundingBox
            The new bounding box.
        """
        x_tl, y_tl, x_br, y_br = self.to_voc(return_values=True)

        return CocoBoundingBox.from_voc(x_tl, y_tl, x_br, y_br, self.image_size, self.strict)
devrimcavusoglu commented 2 years ago

@harshraj22 Yes, the procedure will be like as you stated. Let's continue the discussion on PR after you open it.