cj-mills / pytorch-yolox-object-detection-tutorial-code

This repository contains the training code for my PyTorch YOLOX object detection tutorial.
https://christianjmills.com/series/tutorials/pytorch-train-object-detector-yolox-series.html
MIT License

regarding #2

Closed by bubbleyang111 6 months ago

bubbleyang111 commented 6 months ago

In the section https://christianjmills.com/posts/pytorch-train-object-detector-yolox-tutorial/#preparing-input-data there is this line: target_bboxes = [bbox*(resized_img.size*2) for bbox in target_bboxes]. Why is resized_img.size multiplied by 2 here?

cj-mills commented 6 months ago

Hi @bubbleyang111,

In this context, resized_img.size is a Python tuple, so resized_img.size*2 repeats the values in the tuple rather than multiplying them by 2.

For:

resized_img.size
(384, 512)

We get:

resized_img.size*2
(384, 512, 384, 512)

We then scale the values in bbox by the duplicated values.

target_bboxes
array([array([0.16778879, 0.86416834, 0.09460363, 0.13064645]),
       array([0.584254  , 0.37618581, 0.18003765, 0.13183916])],
      dtype=object)
[bbox*(resized_img.size*2) for bbox in target_bboxes]
[array([ 64.43089536, 442.45419008,  36.32779392,  66.8909824 ]),
 array([224.353536  , 192.60713472,  69.1344576 ,  67.50164992])]
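
The difference between repeating a tuple and multiplying numbers can be checked in isolation. The following is a minimal sketch, not code from the tutorial: it assumes only NumPy, and the name size is a stand-in for resized_img.size, with the first bounding box copied from the output above.

```python
import numpy as np

# Stand-in for resized_img.size (assumption: a plain (width, height) tuple).
size = (384, 512)

# On a tuple, * repeats the elements.
repeated = size * 2            # (384, 512, 384, 512)

# On a NumPy array, * multiplies element-wise instead.
doubled = np.array(size) * 2   # array([ 768, 1024])

# Normalized [x, y, w, h] box copied from the example above.
bbox = np.array([0.16778879, 0.86416834, 0.09460363, 0.13064645])

# NumPy converts the 4-tuple to an array and multiplies element by element.
scaled = bbox * repeated

print(repeated)
print(doubled)
print(scaled)
```

If size were a NumPy array rather than a tuple, size*2 would double the values, so the repetition depends on the tuple type.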

bubbleyang111 commented 6 months ago

Thanks, cj-mills. So it actually expands the size tuple to the same length as a bbox, right?

cj-mills commented 6 months ago

To the same dimensions as the individual bounding boxes in target_bboxes, that's right.

For each bounding box, we need to multiply the values at indices 0 and 2 (the x-coordinate and width) by the resized_img width, and the values at indices 1 and 3 (the y-coordinate and height) by the resized_img height.
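
The same per-index scaling can be written out explicitly, which may make the one-liner easier to verify. This is a sketch under stated assumptions: scale_bbox is a hypothetical helper that does not appear in the tutorial, and width and height stand in for the two values of resized_img.size.

```python
import numpy as np

def scale_bbox(bbox, width, height):
    """Hypothetical helper: scale a normalized [x, y, w, h] box to pixels."""
    x, y, w, h = bbox
    # Indices 0 and 2 scale by the width, indices 1 and 3 by the height.
    return np.array([x * width, y * height, w * width, h * height])

# Stand-in for resized_img.size (assumption: (width, height) order).
width, height = 384, 512

# Second normalized box from the example earlier in the thread.
bbox = np.array([0.584254, 0.37618581, 0.18003765, 0.13183916])

explicit = scale_bbox(bbox, width, height)

# Compact form from the tutorial line: repeat the tuple, then multiply.
one_liner = bbox * ((width, height) * 2)

print(explicit)
print(np.allclose(explicit, one_liner))
```

Both forms give the same values; the tuple repetition is just a compact way to line up width and height with the four box entries.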

bubbleyang111 commented 6 months ago

Got it, thanks.