Open ghost opened 4 years ago
So there's the following example in the README:
```python
from yolo_v2 import YOLOV2_ANCHOR_PRIORS as priors
from yolov2_train import processGroundTruth

image = imread(image_path)
boxes, labels = fetch_bounding_boxes_and_labels()
y_true = processGroundTruth(boxes, labels, priors, (13, 13, 5, 25))
trainnet.m.fit(image[None], y_true[None], steps_per_epoch=30, epochs=10)
```
Your analysis of the code is completely correct: `processGroundTruth` returns a NumPy array of shape `(13, 13, 5, 25)`, but I then pass it to the model's `fit` function as `y_true[None]`, which converts it to `(1, 13, 13, 5, 25)`.
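For illustration, indexing with `None` (an alias for `np.newaxis`) is what prepends that length-1 batch axis; a minimal NumPy sketch, using a zero array as a stand-in for the real target:

```python
import numpy as np

# Stand-in for the (13, 13, 5, 25) target array returned by processGroundTruth
y_true = np.zeros((13, 13, 5, 25))

# Indexing with None prepends a new length-1 axis at the front,
# turning a single sample into a batch of one
batched = y_true[None]

print(batched.shape)  # (1, 13, 13, 5, 25)
```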
So what you should be able to do is build a list of `(bounding boxes, labels)` pairs (called `y` in the following code) and then do:
```python
y_true = np.asarray([
    processGroundTruth(boxes, labels, priors, (13, 13, 5, 25))
    for boxes, labels in y
])
```
and then call `trainnet.m.fit(images, y_true, steps_per_epoch=30, epochs=10)`, where `images` is an array of images.
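To make the batching step concrete, here is a runnable sketch; `process_ground_truth_stub` and the dummy `y` list are stand-ins for the repo's `processGroundTruth` and real annotation data, since those are specific to this project:

```python
import numpy as np

def process_ground_truth_stub(boxes, labels, priors, shape):
    # Stand-in for processGroundTruth: the real function encodes the
    # boxes/labels into a YOLO target grid of the requested shape.
    return np.zeros(shape)

priors = None  # stand-in for YOLOV2_ANCHOR_PRIORS

# y: a list of (boxes, labels) pairs, one per training image (dummy data here:
# 3 boxes of 4 coordinates each, with 3 corresponding labels)
y = [(np.zeros((3, 4)), np.zeros(3)) for _ in range(4)]

# Stacking the per-sample (13, 13, 5, 25) targets yields the batched array
y_true = np.asarray([
    process_ground_truth_stub(boxes, labels, priors, (13, 13, 5, 25))
    for boxes, labels in y
])

print(y_true.shape)  # (4, 13, 13, 5, 25)
```

The batch dimension therefore comes from `np.asarray` stacking the list of per-sample targets, matching the first axis of `images`.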
Does this clarify things?
Hello, I was wondering why the input shape to the loss function is `(13, 13, 5, 25)` and why you opt to leave out the batch dimension, or am I looking at it wrong? I see `processGroundTruth` returns a `y_true` of shape `(13, 13, 5, 25)`, which is the input to the loss function. In general, `y_true` and `y_pred` in a loss function will have `batch_size` as their first dimension, or in this case `number_of_samples / steps_per_epoch`.