ultralytics / yolov5

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Normalized xywh values. #1523

Closed SamiurRahman1 closed 3 years ago

SamiurRahman1 commented 3 years ago

❔Question

I am using the CrowdHuman dataset for training. My training annotations are in xywh format but, as far as I understand, in pixel values. So I convert them as suggested in the instructions, like this:

img = Image.open(root+image_name)
img = img.convert("RGB")
imgheight,imgwidth = img.size
x,y,w,h = a['hbox']
yolo_x = x/imgwidth
yolo_w = w/imgwidth
yolo_y = y/imgheight
yolo_h = h/imgheight

After training, however, the bounding boxes in the example prediction and label images in the logs are misplaced. Is there anything wrong with my conversion method?

SamiurRahman1 commented 3 years ago

I am guessing I need to calculate x_center as x_center = (x+w)/2 and then do yolo_x = x_center/imgwidth. Is that correct?

abidKiller commented 3 years ago

I'm also facing this issue... maybe the last update caused this bug.

SamiurRahman1 commented 3 years ago

I tried this (x_center = (x+w)/2, then yolo_x = x_center/imgwidth). It did not work.

glenn-jocher commented 3 years ago

@SamiurRahman1 @abidKiller git clone the latest code and if you see errors follow the advice in https://docs.ultralytics.com/yolov5/tutorials/train_custom_data.

Errors in train and test jpgs indicate errors in your data.

SamiurRahman1 commented 3 years ago

This is the tutorial that I followed. As for the error being in the data: the same annotations work perfectly when I draw rectangles from them with cv2. Is my conversion method correct, though?

glenn-jocher commented 3 years ago

@SamiurRahman1 conversion is explained in https://docs.ultralytics.com/yolov5/tutorials/train_custom_data

If your labels are correct your jpgs will be correct as well.
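
One way to check labels before training is to confirm that each row has the five fields YOLOv5 expects (class x_center y_center width height) and that the four box values are normalized to the range 0 to 1. The sketch below is only an illustration of that check; `check_label_line` is a hypothetical helper name, not part of the YOLOv5 repository.

```python
def check_label_line(line):
    """Return a list of problems found in one YOLO-format label row.

    Hypothetical helper for illustration; not part of YOLOv5.
    Expected row: class x_center y_center width height (box values in 0..1).
    """
    fields = line.split()
    if len(fields) != 5:
        return [f"expected 5 fields, got {len(fields)}"]
    try:
        values = [float(v) for v in fields[1:]]
    except ValueError:
        return ["box values must be numeric"]

    problems = []
    names = ("x_center", "y_center", "width", "height")
    for name, value in zip(names, values):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is outside 0..1")

    # A box whose edges leave the image can mean that a corner was
    # written where the center was expected.
    x_c, y_c, w, h = values
    if x_c - w / 2 < 0 or x_c + w / 2 > 1 or y_c - h / 2 < 0 or y_c + h / 2 > 1:
        problems.append("box extends past the image border")
    return problems


print(check_label_line("0 0.5 0.5 0.2 0.2"))  # no problems reported
print(check_label_line("0 320 240 50 50"))    # pixel values are flagged
```

A row that still holds pixel values fails the range check immediately, which is a quick way to tell an unnormalized label apart from a merely shifted one.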

abidKiller commented 3 years ago

I used the same dataset 2 days ago and it was working fine, but when I tried to train today, I saw misplaced bounding boxes in the batch images.

abidKiller commented 3 years ago

As you can see, it was working perfectly. (image: 124272607_376235280268126_2843689430244463511_n)

glenn-jocher commented 3 years ago

@abidKiller latest code is recommended.

Xinlei-Ren commented 3 years ago

Is it right to use the labelImg tool for labeling?

glenn-jocher commented 3 years ago

@Xinlei-Ren sure, you can use any labelling tool.

SamiurRahman1 commented 3 years ago

Found the problem. As suspected, my conversion method was wrong. I tried training with a different dataset and it worked, so the only possibility was my conversion method. Here is how to convert the CrowdHuman data:

imgheight,imgwidth = img.size
x,y,w,h = a['hbox']   # for each tag in gtboxes object
x_min = x 
x_max = (x+w)
y_min = y
y_max = (y+h)
x_center = (x_min+x_max)/2.
y_center = (y_min+y_max)/2.
yolo_x = x_center/imgheight
yolo_w = w/imgwidth
yolo_y = y_center/imgwidth
yolo_h = h/imgheight

Hope this helps someone.

glenn-jocher commented 3 years ago

@SamiurRahman1 if you'd like, and if you have time, you can help others by submitting a PR that adds this dataset, e.g. with data/crowdhuman.yaml and a download script in data/scripts/crowdhuman.sh.

nobody-cheng commented 2 years ago

Why does the code above use

yolo_x = x_center/imgheight
yolo_y = y_center/imgwidth

Shouldn't it be the following?

yolo_x = x_center/imgwidth
yolo_y = y_center/imgheight

glenn-jocher commented 10 months ago

@nobody-cheng My apologies for the confusion. You are absolutely right. The correct conversion should be:

yolo_x = x_center/imgwidth
yolo_w = w/imgwidth
yolo_y = y_center/imgheight
yolo_h = h/imgheight

Thank you for catching that mistake!
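
Putting the thread together, a minimal sketch of the full conversion follows. It assumes, as in the posts above, that the CrowdHuman `hbox` is `[x, y, w, h]` in pixels with `(x, y)` as the top-left corner. Note that Pillow's `Image.size` returns `(width, height)`, so unpacking it as `imgheight,imgwidth = img.size` swaps the two names. `hbox_to_yolo` is a hypothetical helper name, not part of the YOLOv5 repository.

```python
def hbox_to_yolo(hbox, img_width, img_height):
    """Convert a pixel [x, y, w, h] box (top-left corner) to normalized
    YOLO (x_center, y_center, width, height).

    Hypothetical helper for illustration; not part of YOLOv5.
    """
    x, y, w, h = hbox
    # The center is the corner plus half the size: x + w/2,
    # which equals (x_min + x_max) / 2 but not (x + w) / 2.
    x_center = x + w / 2
    y_center = y + h / 2
    # x values are divided by the image width, y values by the height.
    return (x_center / img_width, y_center / img_height,
            w / img_width, h / img_height)


# With Pillow, unpack the size in (width, height) order:
#     img_width, img_height = Image.open(path).size
print(hbox_to_yolo([100, 50, 200, 100], 800, 400))  # (0.25, 0.25, 0.25, 0.25)
```

Taking the width and height as explicit arguments keeps the function independent of any image library, so the unpacking order is visible at the call site.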