Closed harshraj22 closed 2 years ago
Hey @harshraj22 thank you for the interest. This would be a good improvement. There are some awaiting structural improvements/changes before that and we can add this to the roadmap, which is sounds like a nice feature. Though, currently I cannot give ETA right away.
Sure @devrimcavusoglu , Let me know if the repo is open for contributions and if there's anyway I could help. I liked the library, and would be happy to improve it further.
@harshraj22 Thank you, contributions and PRs are always welcomed :). Currently I am working on the structural changes particularly, adding classes suport for bounding boxes. As I said, I will take this feature into account right after class implementation is completed (simply because I do not want to introduce breaking changes afterwards) :slightly_smiling_face:.
It is now add to the roadmap 🙂 @harshraj22 are you still interested in contributing ? If so, please review the code base as there are huge structural changes in v0.1.0
.
Hi @devrimcavusoglu , yes I am still interested in contributing. Before I start, I had a few questions related to current codebase:
to_voc
and from_voc
methods. Any conversion across bounding boxes can be carried out using these two methods. Right ?I understand that you might have made it to make the API more userfriendly (https://github.com/devrimcavusoglu/pybboxes/blob/dd4598cb8feae79962f6092e73e9ccb1082cf841/pybboxes/functional.py#L36),, but the target bounding box class could have been identified using the string, and then used the from_voc
method instead ? What's your thought about it ?
eg.
# Note: this is not exact implementation, but just a rough pseudocode.
REGISTERED_BBOXES = {
"voc": VocBoundingBox,
"yolo": YoloBoundingBox,
...
}
def convert_bbox(....):
return REGISTERED_BBOXES[to_type].from_voc(
REGISTERED_BBOXES[from_type](...)
)
Thank you for the reply and the remark @harshraj22. Actually, this was my intention at first.
I understand that you might have made it to make the API more userfriendly
and BoundingBox
currently acts like a factory class convert_bbox()
is still there but the inner parts have been modified from completely functional factory to the current state utilizing BoundingBox
class. Adding to_{box_type}
to the base class BoundingBox
was to provide more user friendly API where you have any type of bounding box object, you can just convert it by just calling to_{desired_box_type}()
, of course this implementation comes with an issue that you stated: if a new bounding box to be added, the base class needs to be modified s.t it has the new bounding box type conversion.
but the target bounding box class could have been identified using the string, and then used the from_voc method instead ? What's your thought about it ?
As I mentioned in the beginning, having conversion methods in the bounding box class such as to_coco
, to_yolo
is desired as you do not need to have to use convert_bbox
at all, it is solely for functional purpose and to assess backward compatibility.
Moreover, we do not need a hardcoded dictionary, in your example, we already have it implemented in load_bbox()
factory function which loads the desired bounding box object from the name anyway, see load_bbox. However, I understand what you mean with the pseudo-code, as far as I understand, we do not really need "to_type" conversion, but having the "to_type" load the box in "to_type" and just read it from from_type
, this is somewhat identical imo though if you see a great impact on this, then of course we can discuss further. All in all, the idea is without needing convert_bbox
the object can be converted by to_{desired_type}
, and yes new potential boxes have to be added if the bounding box type is added.
Alright. I understand your arguments. @devrimcavusoglu
So the feature that I requested, would basically need to have shift
and scale
operations defined. Since each bounding box can have different implementations of these (eg. a box of type x0, y0, x1, y1
would need all 4 points shifted, while the box of type x_center, y_center, height, width
will need just two of them shifted for the shift operation), I was thinking to add these as abstractmethods in Box class. Let me know your thoughts about it.
Also, the variable names x_tl
, x_br
seems confusing to me and what they are expected to represent is not very clear. And do you have plans to push functional
methods to a separate directory (backward compatibility can be achieved by importing all methods in __init__.py
of the directory, and directory name can be kept as functional
) ? Or all future functional methods will be defined in the same file ?
So the feature that I requested, would basically need to have
shift
andscale
operations defined. Since each bounding box can have different implementations of these (eg. a box of typex0, y0, x1, y1
would need all 4 points shifted, while the box of typex_center, y_center, height, width
will need just two of them shifted for the shift operation), I was thinking to add these as abstractmethods in Box class. Let me know your thoughts about it.
@harshraj22 So, there is an abstraction of bounding boxes actually, we took as a basis as VOC style (x_tl, y_tl, x_br, y_br) (tl: top-left, br: bottom-right although they need to be documented well in docstrings, I agree with you there I'll adress this documenting issue in an independent PR.). Therefore, we actually use the same underlying computations (VOC style) for any box and give the results in pixel values (unnormalized). Further, it doesn't matter for a box to be COCO or YOLO type for example, because the computations under the hood uses VOC style for any box type by applying conversion ofc. On the other hand, the values associated with the particular box type is preserved under self.values
property, they will remain as original values (coco values for coco, yolo values for yolo, etc.), so actually we need to modify the attribute self._values
in BaseBoundingBox
and not in the Box
class. If we do
My thought was to add a method (classmethod) called adjust(new_image_size)
to BaseBoundingBox
class, one way to achieve such functionality is to (1) convert the values to VOC, (2) adjust the values to the given new size, (3) set values inplace or return values (inferred by inplace
).
class BaseBoundingBox(Box, ABC):
@classmethod
@abstractmethod
def adjust(cls, new_image_size: Tuple[int, int]) -> "BaseBoundingBox":
# this is abstract method and each box style must implement, it can call parent's `adjust()`
...
return cls(...) # <- return a new instance
something like this, I try to write type hints for guide, but there are many ways to implement it, I think you get my point. However, there is an important thing to remember that new_image_size
may be smaller than the current image (for example think that we have bounding box associated with the bigger image, and want the relative box associated with the crop), so this method adjust()
must be two way.
In this way, there is no need for inplace option, and if we want to add inplace option support it will be more intricate as we need to modife self.values
as well as v1, v2, v3, v4
where this v_i
is different for different types, so the way that I wrote above is better suited I assume.
Also, the variable names x_tl, x_br seems confusing to me and what they are expected to represent is not very clear as I mentioned, I'll address this problem in an independent PR, but for now I can give small description for you to work on this issue.
tl: top-left
br: bottom-right
c: center
And do you have plans to push functional methods to a separate directory (backward compatibility can be achieved by importing all methods in init.py of the directory, and directory name can be kept as functional) ? Or all future functional methods will be defined in the same file ?
This will be definetely in consideration as the coverage of functional
increases, but there should be an independent PR for this also (in the future).
Oh. I see.
@devrimcavusoglu since scale
(which requires just modifying coordinates) and shift
(which also requires modifying image) are quite different operations, I was thinking of having two different abstractmethods in BaseBoundingBox
instead of clubbing them into one adjust
method. This would also make sure that we follow separation of concerns
and would make it easier to test these properties individually. Let me know your thoughts about it.
As per my understanding of the discussion, following could be a concrete implementation of the shift
method for the CocoBoundingBox
:
Let me know if my understanding is correct
def shift(self, horizontal_threshold: float, vertical_threshold: float) -> "CocoBoundingBox":
"""Returns a new bounding box shifted by the given thresholds. The new
bounding box has same image shape, and other properties as the current
object.
Parameters
----------
horizontal_threshold : float
The amount to be shifted in the horizontal axis.
vertical_threshold : float
The amount to be shifted in the vertical axis.
Returns
-------
CocoBoundingBox
The new bounding box.
"""
x_tl, y_tl, x_br, y_br = self.to_voc(return_values=True)
return CocoBoundingBox.from_voc(x_tl, y_tl, x_br, y_br, self.image_size, self.strict)
@harshraj22 Yes, the procedure will be like as you stated. Let's continue the discussion on PR after you open it.
I have an image, and I take a crop from the image. Now I find a bounding box in that crop (top-left of crop is 0,0). I know the position of the crop in original image. Can you add a feature to map this bounding box from crop-relative coordinates to original-image-relative coordintes ?