thtrieu / darkflow

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices
GNU General Public License v3.0

Bounding boxes offset and scale issues - tiny-yolo #563

Open raulincze opened 6 years ago

raulincze commented 6 years ago

Hi,

I'm training a tiny-yolo car detector (one class) on a custom dataset. The loss stalls after reaching a value of ~5. When running the trained detector on a few images from the training set I've noticed that the boxes are present, but the coordinates are off by a consistent offset.

Example: 000129

I've followed all of the instructions in the readme file and browsed the issues for similar problems, but I couldn't find any useful information. I'm about to get my hands dirty and dig into the source for a bit.

Has any of you encountered such an issue before? Any luck fixing it?

anonymre commented 6 years ago

Hi raul, did you fix the problem?

raulincze commented 6 years ago

No, I can't seem to find out what's causing this.

anonymre commented 6 years ago

Is the snapshot you took from a video, or did you just feed in an image? Sorry, I am struggling with a similar problem too.

raulincze commented 6 years ago

It's a simple image. I haven't tried on video, but I presume it would be the same.

RobotRobert commented 6 years ago

I have the exact same issue on a very different custom dataset. Also using tiny-yolo, and my predictions are all offset up and to the right of the real objects (as shown in your image). This appears to be influencing the training (not just a display issue).

Any help appreciated!

RobotRobert commented 6 years ago

Perhaps another clue - I did not have this problem when using the same network and configuration on a different (still custom) dataset which was produced in the same way.

The key differences were:

raulincze commented 6 years ago

Based on your input I've tried adding a dummy class to the detector I'm training, but the bounding boxes still have the same offset. So I think we can rule that out.

I've also computed my own anchors by running k-means clustering on my bounding boxes, but no success with that either.
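For reference, the clustering step looked roughly like this (a sketch, not my exact script; the 1 - IoU distance and the grid-cell scaling of the boxes are assumptions):

import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) pair and k cluster (w, h) pairs, assuming shared centers
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(wh, k=5, iters=100):
    # wh: (N, 2) float array of box widths/heights, already scaled to grid-cell units
    clusters = wh[np.random.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each box to the cluster it overlaps most (smallest 1 - IoU)
        assignment = np.array([1 - iou_wh(b, clusters) for b in wh]).argmin(axis=1)
        for c in range(k):
            if np.any(assignment == c):
                clusters[c] = wh[assignment == c].mean(axis=0)
    return clusters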

Hoping this was just a post-processing bug, I took a look at the yolo2_findboxes Cython code, but it seems to follow what's described in the YOLOv2 paper closely.

I'm having a bit of a problem understanding the code in data.py. For instance, when calculating the regression target the width and height seem to be processed correctly, but in obj[1] and obj[2] it seems that the only things stored are the row and column of the grid cell corresponding to the center of the box. Yet neither the offset relative to the cell nor the prior boxes (anchors) seem to be taken into account. Shouldn't calculating the regression target be the exact inverse of the function applied when post-processing the network's output?
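To spell out what I mean by "inverse": as far as I can tell, yolo2_findboxes decodes the raw network output along the lines of the YOLOv2 paper, roughly like this sketch (my own paraphrase, not the actual Cython code; the variable names are mine):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_box(tx, ty, tw, th, row, col, anchor_w, anchor_h, W, H):
    # raw (tx, ty, tw, th) for one anchor in grid cell (row, col);
    # anchors are in grid-cell units, the result is in fractions of the image
    bx = (col + sigmoid(tx)) / W   # box center x
    by = (row + sigmoid(ty)) / H   # box center y
    bw = anchor_w * np.exp(tw) / W # box width
    bh = anchor_h * np.exp(th) / H # box height
    return bx, by, bw, bh

If the regression target built in data.py is supposed to be the inverse of this, I would expect the anchor and sigmoid/exp parts to show up either there or in the loss, and that is the part I cannot pin down.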

I'm also struggling a bit to understand what each entry in loss_feed_val is and what its role is.

wdvr commented 6 years ago

Could it be that your label bounding boxes are in the wrong coordinate system? (0, 0) is top left, not bottom left. I just encountered something similar here.
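If that is the case, flipping the y coordinates of the labels before training should fix it, along these lines (a sketch, assuming you know the image height and have ymin/ymax in a bottom-left-origin system):

def flip_y(ymin, ymax, img_height):
    # convert a box from a bottom-left origin to the top-left origin the parser expects
    return img_height - ymax, img_height - ymin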

RobotRobert commented 6 years ago

I managed to solve my particular instance of this problem. For me the issue was that the dataset had the image width and height parameters swapped, so many of the object instances were being detected as outside of the feature grid.

Two things were quite useful for me in debugging this: firstly, uncommenting the debugging code in data.py, and secondly, partly reverse-engineering what this code was doing. I've added some comments below which may help.

# Calculate regression target
cellx = 1. * w / W # width of each grid cell in pixels = image width / number of grid columns W
celly = 1. * h / H # height of each grid cell in pixels = image height / number of grid rows H
for obj in allobj:
    centerx = .5*(obj[1]+obj[3]) # xmin, xmax
    centery = .5*(obj[2]+obj[4]) # ymin, ymax
    # print("centerx, centery = ", centerx, centery)
    cx = centerx / cellx # center of the object (unit: cell as float) [horizontal]
    cy = centery / celly # center of the object (unit: cell as float) [vertical]
    if cx >= W or cy >= H: # object center falls outside the W x H feature grid
        print("Path: ", path)
        print("cellx, celly: ", cellx, celly)
        print("cx >= W or cy >= H", cx, cy, W, H)
        print("Obj: ",  obj[1], obj[2], obj[3], obj[4])
        return None, None # drop this example
    obj[3] = float(obj[3]-obj[1]) / w # width of the object as a fraction of image width
    obj[4] = float(obj[4]-obj[2]) / h # height of the object as a fraction of image height
    obj[3] = np.sqrt(obj[3]) # square roots, matching the sqrt(w), sqrt(h) terms in the YOLO loss
    obj[4] = np.sqrt(obj[4])
    obj[1] = cx - np.floor(cx) # centerx, within the cell (unit: cells) [horizontal]
    obj[2] = cy - np.floor(cy) # centery, within the cell (unit: cells) [vertical]

    obj += [int(np.floor(cy) * W + np.floor(cx))] # += 1D index of the cell
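A quick sanity check would have caught my swapped width/height much earlier. Something like the following (a sketch, assuming Pascal-VOC-style XML annotations; the directory layout and tag names are assumptions about your setup):

import os
import xml.etree.ElementTree as ET
from PIL import Image

def check_annotations(ann_dir, img_dir):
    # compare the size recorded in each annotation with the actual image size
    for name in os.listdir(ann_dir):
        if not name.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(ann_dir, name)).getroot()
        ann_w = int(root.find('size/width').text)
        ann_h = int(root.find('size/height').text)
        real_w, real_h = Image.open(os.path.join(img_dir, root.find('filename').text)).size
        if (ann_w, ann_h) != (real_w, real_h):
            print(name, 'annotation says', (ann_w, ann_h), 'but image is', (real_w, real_h))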