experiencor / keras-yolo2

Easy training on custom dataset. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo that detects raccoons and runs entirely in the browser is accessible at https://git.io/vF7vI (not on Windows).
MIT License
1.73k stars 784 forks

Bugs if grid size differ, ex. if image is not square #244

Open exipilis opened 6 years ago

exipilis commented 6 years ago

This code fails if you set grid_w and grid_h to different values, for example when you use YOLO on images that are not square:

cell_x = tf.to_float(tf.reshape(tf.tile(tf.range(self.grid_w), [self.grid_h]), (1, self.grid_h, self.grid_w, 1, 1)))
cell_y = tf.transpose(cell_x, (0,2,1,3,4))
cell_grid = tf.tile(tf.concat([cell_x,cell_y], -1), [self.batch_size, 1, 1, self.nb_box, 1])

https://github.com/experiencor/keras-yolo2/blob/master/frontend.py#L92

ValueError: Dimension 1 in both shapes must be equal, but are 13 and 23 for 'loss/reshape_1_loss/concat' (op: 'ConcatV2') with input shapes: [1,13,23,1,1], [1,23,13,1,1], [] and with computed input tensors: input[2] = <-1>.
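
The shape mismatch in this error can be reproduced without TensorFlow. The sketch below is only an illustration: it mimics the tile/reshape and transpose steps with plain nested lists, drops the batch and box axes, and takes grid_h = 13 and grid_w = 23 from the shapes in the message. The helper names make_grid, transpose and shape are hypothetical and not part of the repo.

```python
def make_grid(inner, outer):
    """Return `outer` rows, each listing 0..inner-1 (stand-in for tile + reshape)."""
    return [list(range(inner)) for _ in range(outer)]


def transpose(grid):
    """Swap the two axes of a rectangular nested list."""
    return [list(col) for col in zip(*grid)]


def shape(grid):
    """Return (rows, columns) of a rectangular nested list."""
    return (len(grid), len(grid[0]))


GRID_H, GRID_W = 13, 23  # assumed from the shapes [1,13,23,1,1] in the error

# cell_x has shape (GRID_H, GRID_W); each entry is its column index.
cell_x = make_grid(GRID_W, GRID_H)

# Transposing cell_x swaps the axes, so the shape becomes (GRID_W, GRID_H).
cell_y = transpose(cell_x)

# The two shapes agree only when GRID_H == GRID_W, which is why concat fails here.
shapes_match = shape(cell_x) == shape(cell_y)
print(shape(cell_x), shape(cell_y), shapes_match)
```

With a square grid the two shapes coincide, which would explain why the problem only shows up for non-square inputs.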

evgevd commented 6 years ago

You should change that part of code in this way:

cell_x = tf.to_float(tf.reshape(tf.tile(tf.range(GRID_W), [GRID_H]), (1, GRID_H, GRID_W, 1, 1)))
cell_y = tf.to_float(tf.reshape(tf.tile(tf.range(GRID_H), [GRID_W]), (1, GRID_W, GRID_H, 1, 1)))
cell_y = tf.transpose(cell_y, (0,2,1,3,4))

cell_grid = tf.tile(tf.concat([cell_x,cell_y], -1), [BATCH_SIZE, 1, 1, BOX, 1])

This gives a grid in which every cell holds its own coordinates. It works with square grids too.
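
To check the claim about cell coordinates, here is a minimal sketch of the same construction with plain nested lists instead of tensors. It is not the repo's code: the batch and box axes are left out, and the helper names make_grid, transpose and build_cell_grid are hypothetical.

```python
def make_grid(inner, outer):
    """Return `outer` rows, each listing 0..inner-1 (stand-in for tile + reshape)."""
    return [list(range(inner)) for _ in range(outer)]


def transpose(grid):
    """Swap the two axes of a rectangular nested list."""
    return [list(col) for col in zip(*grid)]


def build_cell_grid(grid_h, grid_w):
    """Return a grid_h x grid_w grid whose entry [row][col] is the pair (col, row)."""
    # Shape (grid_h, grid_w); each entry is its column index.
    cell_x = make_grid(grid_w, grid_h)
    # Built as (grid_w, grid_h) with row indices as values, then transposed
    # to (grid_h, grid_w) so it lines up with cell_x.
    cell_y = transpose(make_grid(grid_h, grid_w))
    return [
        [(cell_x[r][c], cell_y[r][c]) for c in range(grid_w)]
        for r in range(grid_h)
    ]


grid = build_cell_grid(13, 23)  # non-square example: 13 rows, 23 columns
print(len(grid), len(grid[0]), grid[5][7])
```

Each entry holds (x, y) = (column, row), so grid[5][7] is (7, 5), and the same function handles a square grid without any special case.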

rodrigo2019 commented 6 years ago

It is working in my repo; you can check it here. I did something very similar to what @evgevd said.

fenilsuchak commented 5 years ago

Nope @rodrigo2019, still the same error with your fork.