Which part of the code converts .xml annotations (xmin, xmax, ymin, ymax) to the required format (x, y, w, h)?

experiencor / keras-yolo2

Easy training on custom datasets. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo that detects raccoons, running entirely in the browser, is accessible at https://git.io/vF7vI (not on Windows).

MIT License

1.73k stars 784 forks

Which part of the code converts .xml annotations (xmin, xmax, ymin, ymax) to the required format (x, y, w, h)? #273

Closed ManjeeraJagiri closed 6 years ago

ManjeeraJagiri commented 6 years ago

Hi,

We provide .xml annotations for each image, where xmin, xmax, ymin, ymax are the bounding box (BB) coordinates of each object. I am wondering where in the code these are converted to coordinates relative to the grid cells, i.e. (13, 13, 5, 6) (assuming I am using 5 anchor boxes and 1 class). Is it at the line below?

train_batch = BatchGenerator(train_imgs, generator_config, norm=normalize)

ZacharyForrest commented 6 years ago

Check preprocessing.py.

ManjeeraJagiri commented 6 years ago

Hi,

I couldn't really understand the BatchGenerator code in preprocessing.py. Any way to get a gist of it?

ZacharyForrest commented 6 years ago

center_w = (obj['xmax'] - obj['xmin']) / (float(self.config['IMAGE_W']) / self.config['GRID_W']) # unit: grid cell
center_h = (obj['ymax'] - obj['ymin']) / (float(self.config['IMAGE_H']) / self.config['GRID_H']) # unit: grid cell
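The width and height expressions above divide a pixel extent by the size of one grid cell. Below is a minimal standalone sketch of that idea, extended to the box center, which is assumed here to be handled the same way. The function name to_grid_units, the 416x416 input size, and the sample box are hypothetical and chosen only for illustration; this is not a copy of the BatchGenerator code.

```python
import math


def to_grid_units(obj, config):
    """Convert a corner-format box (pixels) to (x, y, w, h) in grid-cell units.

    Hypothetical helper for illustration; not part of keras-yolo2.
    """
    # Size of one grid cell in pixels along each axis.
    cell_w = float(config['IMAGE_W']) / config['GRID_W']
    cell_h = float(config['IMAGE_H']) / config['GRID_H']

    # Box center, expressed in grid cells (assumed to mirror the w/h scaling).
    center_x = 0.5 * (obj['xmin'] + obj['xmax']) / cell_w
    center_y = 0.5 * (obj['ymin'] + obj['ymax']) / cell_h

    # Box width and height, expressed in grid cells (as in the lines above).
    center_w = (obj['xmax'] - obj['xmin']) / cell_w
    center_h = (obj['ymax'] - obj['ymin']) / cell_h

    # Index of the grid cell that contains the box center.
    grid_x = int(math.floor(center_x))
    grid_y = int(math.floor(center_y))

    return [center_x, center_y, center_w, center_h], (grid_x, grid_y)


# Assumed configuration: 416x416 input image and a 13x13 grid (32-pixel cells).
config = {'IMAGE_W': 416, 'IMAGE_H': 416, 'GRID_W': 13, 'GRID_H': 13}
obj = {'xmin': 96, 'xmax': 224, 'ymin': 64, 'ymax': 192}

box, cell = to_grid_units(obj, config)
print(box, cell)
```

With these assumed values each cell is 32 pixels wide, so the sample box has its center at (5.0, 4.0) and measures 4.0 by 4.0 grid cells, and its center falls in cell (5, 4).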

ManjeeraJagiri commented 6 years ago

I got it now! Thanks.