Closed rainley123 closed 5 years ago
I've got the same problem, and I also used some functions provided by this repository. I used VOC2012 as the dataset. I found that some bounding boxes cannot be encoded with the anchors, because their Jaccard overlap (IoU) is lower than 0.5. I changed batch_size to 1 and printed the loss for every batch. Sometimes the loss is 0. I think this is because those bounding boxes are not compatible with the anchors. We could try clustering to find the most suitable anchor sizes, as YOLO does.
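For reference, the clustering idea could look roughly like the sketch below. It runs k-means on box (width, height) pairs with 1 - IoU as the distance, which is the approach YOLOv2 describes. The function names and the sample box sizes are made up for illustration and are not part of this repository.

```python
import random


def wh_iou(box, anchor):
    """IoU of two (width, height) pairs that share the same centre."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs; the closest centroid is the one with the highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
                continue
            mean_w = sum(w for w, _ in members) / len(members)
            mean_h = sum(h for _, h in members) / len(members)
            new_centroids.append((mean_w, mean_h))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Sort by area so the result is easy to compare between runs.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical normalised ground-truth box sizes (width, height).
boxes = [(0.10, 0.12), (0.11, 0.10), (0.09, 0.11),
         (0.50, 0.30), (0.52, 0.28), (0.48, 0.33)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

In practice the (w, h) pairs would come from the VOC2012 annotations, normalised by image size, and k would be chosen to match the number of anchor shapes the network expects.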
In this repository, doesn't the function bboxes_encode() find the suitable anchors? Also, I can't understand some of the code in this repository. One difference from your case is that my loss never reaches 0.
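One way to check whether the anchors are the issue is to count the ground-truth boxes whose best IoU with any anchor falls below the match threshold. The sketch below is not the repository's bboxes_encode(). The (ymin, xmin, ymax, xmax) box layout and the helper names are assumptions for illustration only.

```python
def jaccard(a, b):
    """IoU of two boxes given as (ymin, xmin, ymax, xmax)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unmatched_boxes(gt_boxes, anchors, threshold=0.5):
    """Return the ground-truth boxes that no anchor overlaps by at least `threshold`."""
    return [g for g in gt_boxes
            if max(jaccard(g, a) for a in anchors) < threshold]


# Two hypothetical square anchors centred at (0.5, 0.5).
anchors = [(0.4, 0.4, 0.6, 0.6), (0.3, 0.3, 0.7, 0.7)]
# One box that coincides with an anchor and one very wide, flat box.
gt_boxes = [(0.4, 0.4, 0.6, 0.6), (0.45, 0.1, 0.55, 0.9)]

missed = unmatched_boxes(gt_boxes, anchors)
print(missed)
```

If many boxes in a dataset end up in this list, the anchor sizes or aspect ratios probably do not fit the data well.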
I have the same problem with the loss. It is very unstable and converges poorly. This is what the training process looks like:
I have used the following settings:
dataset: VOC2012
batch_size = 32
loss_alpha = 1
negative_ratio = 3
match_threshold = 0.5
label_smoothing = 0.0
weight_decay = 0.0005
MOVING_AVERAGE_DECAY = 0.9999
LEARNING_RATE_DECAY_FACTOR = 0.94
INITIAL_LEARNING_RATE = 0.001
MOMENTUM = 0.9
SAMPLES_PER_EPOCH = 17125
EPOCHS_PER_DECAY = 2.0
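To see what these settings mean for the learning rate, here is a minimal sketch. It assumes the constants feed a conventional exponential decay schedule in which the number of steps per decay is batches per epoch times EPOCHS_PER_DECAY. The actual training script may wire them differently.

```python
BATCH_SIZE = 32
SAMPLES_PER_EPOCH = 17125
EPOCHS_PER_DECAY = 2.0
INITIAL_LEARNING_RATE = 0.001
LEARNING_RATE_DECAY_FACTOR = 0.94

# Assumed convention: decay once every EPOCHS_PER_DECAY epochs' worth of batches.
decay_steps = int(SAMPLES_PER_EPOCH / BATCH_SIZE * EPOCHS_PER_DECAY)


def learning_rate(step, staircase=True):
    """Exponentially decayed learning rate at a given global step."""
    exponent = step // decay_steps if staircase else step / decay_steps
    return INITIAL_LEARNING_RATE * LEARNING_RATE_DECAY_FACTOR ** exponent


for step in (0, decay_steps, 10 * decay_steps):
    print(step, learning_rate(step))
```

Under this assumption the rate only drops by 6% every two epochs, so the schedule by itself is unlikely to stabilise a loss that oscillates from the first epochs.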
I tried setting the batch size to 1, training on one image at a time. Sometimes the loss is 0, but that doesn't matter. I then set the batch size to 16, and the model converged after 17000 batches.
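A zero loss at batch size 1 is what you would expect from an SSD-style loss when an image has no matched anchors. The SSD paper sets the loss to 0 when the number of positives is 0, and the number of negatives kept by hard negative mining is capped at negative_ratio times the number of positives. The sketch below illustrates this with made-up per-anchor loss values. Whether this repository's loss behaves the same way is an assumption.

```python
def ssd_style_loss(pos_cls_losses, neg_cls_losses, loc_losses,
                   negative_ratio=3, alpha=1.0):
    """Combine per-anchor losses with hard negative mining, normalised by positives."""
    n_pos = len(pos_cls_losses)
    if n_pos == 0:
        # No matched anchors: nothing to localise and no negatives are kept.
        return 0.0
    n_neg = min(len(neg_cls_losses), negative_ratio * n_pos)
    hardest_neg = sorted(neg_cls_losses, reverse=True)[:n_neg]
    total = sum(pos_cls_losses) + sum(hardest_neg) + alpha * sum(loc_losses)
    return total / n_pos


# An image whose boxes matched no anchor contributes nothing.
print(ssd_style_loss([], [2.0, 1.0], []))
# One positive anchor keeps only the 3 hardest negatives out of 4.
print(ssd_style_loss([1.0], [0.5, 0.4, 0.3, 0.2], [0.5]))
```

With a larger batch, at least some images usually have positives, which would explain why the zero loss only shows up at batch size 1.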
I have a problem with my code. I used the core code from this commit, such as net(), anchors(), bboxes_encode(), bboxes_decode(), etc., and I feed the dataset using tf.data. It runs successfully; however, the loss stays between 80 and 100, keeps fluctuating, and does not converge. Can anyone tell me where the problem is: the data, the net, or the optimizer?