Closed bubbleyang111 closed 6 months ago
Hi @bubbleyang111,

In this context, `resized_img.size*2` duplicates the values in `resized_img.size` rather than multiplying them by 2.
For:

```python
resized_img.size
# (384, 512)
```

We get:

```python
resized_img.size*2
# (384, 512, 384, 512)
```
We then scale the values in each bbox by the duplicated values:

```python
target_bboxes
# array([array([0.16778879, 0.86416834, 0.09460363, 0.13064645]),
#        array([0.584254  , 0.37618581, 0.18003765, 0.13183916])],
#       dtype=object)

[bbox*(resized_img.size*2) for bbox in target_bboxes]
# [array([ 64.43089536, 442.45419008, 36.32779392, 66.8909824 ]),
#  array([224.353536  , 192.60713472, 69.1344576 , 67.50164992])]
```
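The exchange above hinges on two behaviors: Python's sequence repetition (`tuple * 2` repeats, it does not multiply) and NumPy's element-wise broadcasting when an array is multiplied by a tuple. A minimal sketch using the values from the thread:

```python
import numpy as np

# PIL's Image.size is a plain (width, height) tuple, so * 2 repeats it
size = (384, 512)
scale = size * 2  # tuple repetition, not arithmetic
assert scale == (384, 512, 384, 512)

# Normalized [x, y, w, h] bounding boxes from the thread
target_bboxes = [
    np.array([0.16778879, 0.86416834, 0.09460363, 0.13064645]),
    np.array([0.584254, 0.37618581, 0.18003765, 0.13183916]),
]

# Multiplying a NumPy array by a tuple broadcasts element-wise,
# converting normalized coordinates to pixel coordinates
pixel_bboxes = [bbox * scale for bbox in target_bboxes]
print(pixel_bboxes[0])  # ~[64.43, 442.45, 36.33, 66.89]
```

If `resized_img.size` were a NumPy array instead of a tuple, `* 2` would double the values, so this trick relies on it being a plain Python tuple.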
Thanks @cj-mills. So it actually expands the tuple to the same dimensions as the bboxes, right?
To the same dimensions as the individual bounding boxes in `target_bboxes`, that's right.

We need to multiply the `0` and `2` index values (the x-coordinate and width) of each bounding box by the `resized_img` width, and the `1` and `3` index values (the y-coordinate and height) by the `resized_img` height.
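To make that index mapping concrete, the explicit per-index scaling is equivalent to the one-line `bbox * (resized_img.size * 2)` form. A small sketch (the variable names here are illustrative, not from the tutorial):

```python
import numpy as np

img_w, img_h = 384, 512  # resized_img.size from the thread

# One normalized [x, y, w, h] bounding box
bbox = np.array([0.16778879, 0.86416834, 0.09460363, 0.13064645])

# Scaling each index explicitly...
scaled = np.array([
    bbox[0] * img_w,  # index 0: x-coordinate scaled by image width
    bbox[1] * img_h,  # index 1: y-coordinate scaled by image height
    bbox[2] * img_w,  # index 2: box width scaled by image width
    bbox[3] * img_h,  # index 3: box height scaled by image height
])

# ...matches multiplying by the duplicated (w, h, w, h) tuple
assert np.allclose(scaled, bbox * ((img_w, img_h) * 2))
```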
got it. thanks.
In the section https://christianjmills.com/posts/pytorch-train-object-detector-yolox-tutorial/#preparing-input-data, there is `target_bboxes = [bbox*(resized_img.size*2) for bbox in target_bboxes]`. Why multiply by 2 here?