kylemcdonald opened this issue 4 years ago (status: Open)
@kylemcdonald Thanks for sharing. We also have to tailor the `yolo_loss` function for rectangular inputs, as follows (note the inverse step must divide by `grid_size`, matching the forward decode in `yolo_boxes()`):

```python
def yolo_loss(y_true, y_pred):
    # 3. inverting the pred box equations
    grid_size = tf.cast(tf.shape(y_true)[1:3][::-1], tf.float32)
    grid = tf.meshgrid(tf.range(grid_size[0]), tf.range(grid_size[1]))
    # (Optional) if you have normalized your anchors by `height/width`, this step is not necessary
    true_wh = tf.math.log(true_wh / anchors * grid_size / grid_size[0])
```
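For reference (my own sketch, not code from the repo), the inversion above has to mirror the forward decode in `yolo_boxes()`: if each axis is decoded as `wh = exp(t) * anchor * (gw / g)`, the loss must encode `t = log(wh / anchor * g / gw)`. A pure-Python round trip with a hypothetical 576x320 input:

```python
import math

# Standalone sketch (not from the repo): a 576x320 input at stride 32
# gives an 18x10 grid; the anchor is normalized by width on both axes.
grid_size = (18.0, 10.0)                 # (gw, gh), matching yolo_boxes
anchor = (5.0 / 576.0, 5.0 / 576.0)

def decode_wh(t):
    # forward pass: box_wh = exp(t) * anchor * (grid_size[0] / grid_size)
    return tuple(math.exp(ti) * a * (grid_size[0] / g)
                 for ti, a, g in zip(t, anchor, grid_size))

def encode_wh(wh):
    # inverse (the yolo_loss step): t = log(wh / anchor * grid_size / grid_size[0])
    return tuple(math.log(w / a * g / grid_size[0])
                 for w, a, g in zip(wh, anchor, grid_size))

t = (0.3, -0.7)
t_back = encode_wh(decode_wh(t))
assert all(abs(a - b) < 1e-9 for a, b in zip(t, t_back))
```

If encode and decode do not round-trip exactly, the loss penalizes predictions that are already correct, which is the symptom this fix addresses.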
However, if the a priori anchors have been normalized by height/width, we don't need to adjust the ratio of the anchors. For example, if an anchor (5, 5) is normalized proportionally to (5/576, 5/320), then the following line in `yolo_boxes()` is not necessary:
```python
def yolo_boxes(pred, anchors, classes):
    # (Optional) if you have normalized your anchors by `height/width`, this step is not necessary
    box_wh = tf.exp(box_wh) * anchors * (grid_size[0] / grid_size)
```
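To see why the rescaling becomes redundant once the anchors are normalized by height/width, here is a small pure-Python check (my own numbers, reusing the (5, 5) anchor and the 576x320 example from this thread): normalizing by width on both axes and then scaling the height term by gw/gh lands on exactly the same values as normalizing directly by (width, height).

```python
w, h = 576, 320
gw, gh = w // 32, h // 32            # per-axis grid sizes: 18 and 10

anchor_px = (5.0, 5.0)

# Option A: normalize the anchor directly by (width, height) -> skip rescaling
norm_by_wh = (anchor_px[0] / w, anchor_px[1] / h)

# Option B: normalize by width on both axes, then rescale height by gw/gh
norm_by_w = (anchor_px[0] / w, anchor_px[1] / w)
rescaled = (norm_by_w[0] * (gw / gw), norm_by_w[1] * (gw / gh))

assert all(abs(a - b) < 1e-12 for a, b in zip(norm_by_wh, rescaled))
```

Both options are equivalent because gw/gh equals w/h when both dimensions are multiples of the stride; pick one and apply it consistently in both `yolo_boxes()` and `yolo_loss()`.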
Hi, when you train on rectangular images, don't you also need to modify the code here (dataset.py):
```python
def transform_targets(y_train, anchors, anchor_masks, width, height):
    y_outs = []
    grid_size_w = width // 32
    grid_size_h = height // 32
    # ...
    for anchor_idxs in anchor_masks:
        y_outs.append(transform_targets_for_output(
            y_train, grid_size_w, grid_size_h, anchor_idxs))
        grid_size_w *= 2
        grid_size_h *= 2
    return tuple(y_outs)
```
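As a sanity check on the loop above (a standalone sketch, not part of dataset.py), the per-axis grid sizes start at `width // 32` and `height // 32` and double at each of the three output scales:

```python
def grid_sizes(width, height, num_scales=3):
    # YOLOv3's coarsest output has stride 32; each finer scale halves the stride.
    assert width % 32 == 0 and height % 32 == 0, "input dims must be multiples of 32"
    gw, gh = width // 32, height // 32
    out = []
    for _ in range(num_scales):
        out.append((gw, gh))
        gw *= 2
        gh *= 2
    return out

print(grid_sizes(576, 320))  # [(18, 10), (36, 20), (72, 40)]
```

For the square 416x416 default this reduces to the familiar 13/26/52 grids, which is why the single `grid_size` variable worked before.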
and here:
```python
@tf.function  # TODO: check here if it is ok!!!
def transform_targets_for_output(y_true, grid_size_w, grid_size_h, anchor_idxs):
    # y_true: (N, boxes, (x1, y1, x2, y2, class, best_anchor))
    N = tf.shape(y_true)[0]

    # y_true_out: (N, grid_h, grid_w, anchors, [x, y, w, h, obj, class])
    y_true_out = tf.zeros(
        (N, grid_size_h, grid_size_w, tf.shape(anchor_idxs)[0], 6))
    # ...
    anchor_idx = tf.cast(tf.where(anchor_eq), tf.int32)
    grid_xy = tf.cast(box_xy // ((1 / grid_size_w), (1 / grid_size_h)), tf.int32)
```
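The `grid_xy` line maps a normalized box center to per-axis cell indices by floor-dividing by the cell extent. A pure-Python illustration (the function name is mine, not from dataset.py):

```python
def cell_for_center(box_xy, grid_size_w, grid_size_h):
    # box_xy is the box center normalized to [0, 1) in each axis;
    # floor-dividing by the cell width/height yields the cell index.
    x, y = box_xy
    return (int(x // (1 / grid_size_w)), int(y // (1 / grid_size_h)))

# A center at (0.52, 0.43) on an 18x10 grid lands in cell (9, 4).
print(cell_for_center((0.52, 0.43), 18, 10))  # (9, 4)
```

With a single square `grid_size`, a box near the right edge of a wide image would be floored into a cell index that does not exist on the vertical axis, which is what the per-axis divisors fix.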
And finally, you will have to adjust your model to take a custom input size of (width x height) in models.py:

```python
def YoloV3(width=None, height=None, channels=3, anchors=yolo_anchors,
           masks=yolo_anchor_masks, classes=80, training=False):
    x = inputs = Input([height, width, channels], name='input')
```
Hello, and thanks for your work.
I would like to use a network I trained using https://github.com/AlexeyAB/darknet/. It accepts 576x320x3 input and predicts 4 classes.
I started by converting the network:
Then I ran my code (note: I had to hardcode `yolo_iou_threshold` and `yolo_iou_score` for this code to run):

But I get an error. I also tried with `size=None` and got the same error. I think it might be related to the network being created under an assumption of equal width and height, but I can't find where. How can I fix this? Note that I don't get the error on the 320x320 network, and I can also pass other square sizes like 160x160 or 640x640 without problems. I've pasted the error below. Thank you!

Update: it looks like the code that needs to be changed is inside `yolo_boxes()`. I had to account for the fact that `grid_size` is different in each axis. When porting models from Darknet, this also means scaling the anchors by the aspect ratio. Here's what I did to get it working:

Note that I copied my anchors from the Darknet `yolov3.cfg` file and used them like this:
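The snippet itself isn't shown above, but the idea can be sketched like this (the pixel values below are the stock COCO YOLOv3 anchors, standing in as placeholders for the actual cfg values; this is a sketch of the approach, not the author's exact code). Darknet stores anchors in pixels at the network input resolution, so for a 576x320 network each (w, h) pair gets divided by (576, 320) to produce the height/width-normalized anchors discussed earlier in the thread:

```python
# Placeholder anchor pixel values; the real ones come from your yolov3.cfg.
cfg_anchors = [(10, 13), (16, 30), (33, 23),
               (30, 61), (62, 45), (59, 119),
               (116, 90), (156, 198), (373, 326)]

net_w, net_h = 576, 320  # Darknet network input size from the cfg

# Normalize each (w, h) pair by the network input size.
anchors = [(w / net_w, h / net_h) for w, h in cfg_anchors]
# e.g. anchors[0] is (10/576, 13/320)
```

With anchors normalized this way, the optional rescaling steps in `yolo_boxes()` and `yolo_loss()` discussed above can be dropped.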