hhk7734 / tensorflow-yolov4

YOLOv4 Implemented in Tensorflow 2.
MIT License

bad results when using trained weight #35

Closed: flugenheimer closed this issue 3 years ago

flugenheimer commented 4 years ago

When training I get good results and a mAP score of 91% on a custom dataset. I then save the model and try to load it into a new yolo object:

yolo = YOLOv4(tiny=False)
yolo.classes = "/content/classes.names"
yolo.make_model()
yolo.load_weights("/content/backup/weights/yolov4-final.weights", weights_type="yolo")

Calculating the mAP again with the loaded weights gives a mAP of 0%, or in some cases a low percentage. I don't understand why it is not working and giving consistent results.

I have tried using the weights saved by default after training is done, and also saving them explicitly using:

yolo.save_weights("/content/backup/weights/custom.weights", weights_type="yolo")

It seems like the weights are not really being saved or loaded properly. I am not sure if it is a bug or if I am missing a step.

hhk7734 commented 4 years ago

Can you share your training script?

flugenheimer commented 4 years ago
from tensorflow.keras import callbacks, optimizers
from yolov4.tf import SaveWeightsCallback, YOLOv4
import os, sys
import cv2
import numpy as np
from mapcalc import calculate_map

yolo = YOLOv4(tiny=False)
yolo.classes = "/content/classes.names"
yolo.input_size = 416
yolo.batch_size = 8
lr = 1e-5
epochs = 50
validation_steps = 10
validation_freq = 5
steps_per_epoch = 50

yolo.make_model(activation1="relu")

yolo.load_weights(
    "/content/drive/My Drive/3d_modeller/yolo_dataset/yolov4.conv.137",
    weights_type="yolo"
)

train_data_set = yolo.load_dataset(
    "/content/train.txt",
    dataset_type="yolo",
    label_smoothing=0.05
)

val_data_set = yolo.load_dataset(
    "/content/test.txt",
    dataset_type="yolo",
    training=False
)

optimizer = optimizers.Adam(learning_rate=lr)
yolo.compile(optimizer=optimizer, loss_iou_type="ciou")

def lr_scheduler(epoch):
    # Step decay: full lr for the first half of training,
    # 0.5x until 80% of the epochs, 0.1x until 90%,
    # then back to the full lr for the final epochs.
    if epoch < int(epochs * 0.5):
        return lr
    if epoch < int(epochs * 0.8):
        return lr * 0.5
    if epoch < int(epochs * 0.9):
        return lr * 0.1

    return lr

_callbacks = [
    callbacks.LearningRateScheduler(lr_scheduler),
    callbacks.TerminateOnNaN()
]

yolo.fit(
    train_data_set,
    epochs=epochs,
    callbacks=_callbacks,
    validation_data=val_data_set,
    validation_steps=validation_steps,
    validation_freq=validation_freq,
    steps_per_epoch=steps_per_epoch,
)

yolo.save_weights("/content/backup/weights/custom.weights", weights_type="yolo")

This works fine for me, and calculating a mAP score with this yolo object gives a good score.

hhk7734 commented 4 years ago

What is the epochs value? How many classes are there? How many hours did you spend training?

flugenheimer commented 4 years ago

input_size = 416
batch_size = 8
epochs = 50
learning_rate = 1e-05
validation_steps = 10
validation_freq = 5
steps_per_epoch = 50

There is only 1 class.

It takes about 30 minutes to train.

I have also tried a lot of different values for the hyperparameters (just a for-loop testing values). No matter the settings, the results are poor when I use the saved weights afterwards.

hhk7734 commented 4 years ago

Oh... I found it.

Your training script uses yolo.make_model(activation1="relu") but your inference script uses yolo.make_model().

yolo.make_model(activation1="relu") is for tiny on the Google Coral board.
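
In other words, the two scripts build different network graphs, so weights trained with one activation will not evaluate correctly when loaded into the other. A minimal side-by-side sketch of the mismatch, using the paths from the scripts above:

from yolov4.tf import YOLOv4

# Graph built by the training script: ReLU variant, intended for tiny
# models on an Edge TPU such as the Google Coral.
yolo_train = YOLOv4(tiny=False)
yolo_train.classes = "/content/classes.names"
yolo_train.make_model(activation1="relu")

# Graph built by the inference script: the library's default activations.
yolo_infer = YOLOv4(tiny=False)
yolo_infer.classes = "/content/classes.names"
yolo_infer.make_model()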

flugenheimer commented 4 years ago

Oh... I found it.

Your training script uses yolo.make_model(activation1="relu") but your inference script uses yolo.make_model().

yolo.make_model(activation1="relu") is for tiny on the Google Coral board.

I just tried training with plain yolo.make_model(). I get a 90.5% mAP when running on that object, but when creating a new object and loading the weights, I still get a mAP of 0%.

hhk7734 commented 4 years ago

Hmm... I think this could happen if the dataset is wrong.

I don't know if it will work on Colab; try the code below.

import cv2
import numpy as np
import tensorflow as tf
from yolov4.tf import YOLOv4
from google.colab.patches import cv2_imshow

yolo = YOLOv4()
yolo.classes = "/content/classes.names"
yolo.input_size = 416
yolo.batch_size = 2

dataset = yolo.load_dataset(
    "/content/train.txt",
    dataset_type="yolo",
    label_smoothing=0.05
)

for i, (images, gt) in enumerate(dataset):
    for j in range(len(images)):
        # Reshape each ground-truth scale to the flat candidate format
        # (grid_h * grid_w * 3 anchors per scale) and concatenate the scales.
        _candidates = []
        for candidate in gt:
            grid_size = candidate.shape[1:3]
            _candidates.append(
                tf.reshape(
                    candidate[j], shape=(1, grid_size[0] * grid_size[1] * 3, -1)
                )
            )
        candidates = np.concatenate(_candidates, axis=1)

        # Undo the 0-1 normalization to get a displayable image.
        frame = images[j, ...] * 255
        frame = frame.astype(np.uint8)

        # Decode the ground truth through the same path as predictions
        # and draw the resulting boxes on the image.
        pred_bboxes = yolo.candidates_to_pred_bboxes(candidates[0])
        pred_bboxes = yolo.fit_pred_bboxes_to_original(pred_bboxes, frame.shape)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        image = yolo.draw_bboxes(frame, pred_bboxes)
        cv2_imshow("result", image)
        while cv2.waitKey(10) & 0xFF != ord("q"):
            pass
    if i == 10:
        break

flugenheimer commented 4 years ago

Your code gives me this error: TypeError: candidates_to_pred_bboxes() missing 2 required positional arguments: 'iou_threshold' and 'score_threshold'

I just corrected it to work in Google Colab.

The data looks fine; objects are marked with a bounding box around them.

import cv2
import numpy as np
import tensorflow as tf
from yolov4.tf import YOLOv4
from google.colab.patches import cv2_imshow

yolo = YOLOv4()
yolo.classes = "/content/classes.names"
yolo.input_size = 416
yolo.batch_size = 2

dataset = yolo.load_dataset(
    "/content/train.txt",
    dataset_type="yolo"
    #label_smoothing=0.05
)

for i, (images, gt) in enumerate(dataset):
    for j in range(len(images)):
        _candidates = []
        for candidate in gt:
            grid_size = candidate.shape[1:3]
            _candidates.append(
                tf.reshape(
                    candidate[j], shape=(1, grid_size[0] * grid_size[1] * 3, -1)
                )
            )
        candidates = np.concatenate(_candidates, axis=1)

        frame = images[j, ...] * 255
        frame = frame.astype(np.uint8)

        pred_bboxes = yolo.candidates_to_pred_bboxes(
            candidates[0], iou_threshold=0.5, score_threshold=0.5
        )
        pred_bboxes = yolo.fit_pred_bboxes_to_original(pred_bboxes, frame.shape)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        image = yolo.draw_bboxes(frame, pred_bboxes)
        cv2_imshow(image)
    if i == 10:
        break

hhk7734 commented 4 years ago

In my case, the code below works well.

import cv2
import numpy as np
import tensorflow as tf
from yolov4.tf import YOLOv4
from google.colab.patches import cv2_imshow
import time

yolo = YOLOv4()
yolo.classes = "/content/drive/My Drive/Hard_Soft/NN/coco/coco.names"
yolo.input_size = 416
yolo.batch_size = 2

dataset = yolo.load_dataset(
    "/content/drive/My Drive/Hard_Soft/NN/coco/train2017.txt",
    label_smoothing=0.05,
    image_path_prefix="/content/train2017"
)

for i, (images, gt) in enumerate(dataset):
    for j in range(len(images)):
        _candidates = []
        for candidate in gt:
            grid_size = candidate.shape[1:3]
            _candidates.append(
                tf.reshape(
                    candidate[j], shape=(1, grid_size[0] * grid_size[1] * 3, -1)
                )
            )
        candidates = np.concatenate(_candidates, axis=1)

        frame = images[j, ...] * 255
        frame = frame.astype(np.uint8)

        pred_bboxes = yolo.candidates_to_pred_bboxes(
            candidates[0], iou_threshold=0.1, score_threshold=0.25
        )
        pred_bboxes = yolo.fit_pred_bboxes_to_original(pred_bboxes, frame.shape)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        image = yolo.draw_bboxes(frame, pred_bboxes)
        cv2_imshow(image)
        time.sleep(0.5)
    if i == 10:
        break

flugenheimer commented 4 years ago

The data looks fine, with bounding boxes around each object.

hhk7734 commented 4 years ago

Hmm, I don't know what is wrong.

I don't know how you calculated the mAP, but it is strange that 90% came out. This is before the v2.0.0 release, so it is difficult to help more right now. After the release, I will test under conditions similar to yours.

hhk7734 commented 4 years ago

Ref: https://colab.research.google.com/drive/1dkpJCMEZdPz6fwgfKPW6VV1AiE6qS5uL?usp=sharing

Adeel-Intizar commented 4 years ago

Your code gives me this error: TypeError: candidates_to_pred_bboxes() missing 2 required positional arguments: 'iou_threshold' and 'score_threshold'

I just corrected it to work in Google Colab.

The data looks fine; objects are marked with a bounding box around them.

import cv2
import numpy as np
import tensorflow as tf
from yolov4.tf import YOLOv4
from google.colab.patches import cv2_imshow

yolo = YOLOv4()
yolo.classes = "/content/classes.names"
yolo.input_size = 416
yolo.batch_size = 2

dataset = yolo.load_dataset(
    "/content/train.txt",
    dataset_type="yolo"
    #label_smoothing=0.05
)

for i, (images, gt) in enumerate(dataset):
    for j in range(len(images)):
        _candidates = []
        for candidate in gt:
            grid_size = candidate.shape[1:3]
            _candidates.append(
                tf.reshape(
                    candidate[j], shape=(1, grid_size[0] * grid_size[1] * 3, -1)
                )
            )
        candidates = np.concatenate(_candidates, axis=1)

        frame = images[j, ...] * 255
        frame = frame.astype(np.uint8)

        pred_bboxes = yolo.candidates_to_pred_bboxes(
            candidates[0], iou_threshold=0.5, score_threshold=0.5
        )
        pred_bboxes = yolo.fit_pred_bboxes_to_original(pred_bboxes, frame.shape)
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        image = yolo.draw_bboxes(frame, pred_bboxes)
        cv2_imshow(image)
    if i == 10:
        break

Try using iou_threshold=0.45 and score_threshold=0.1. I had the same error, but these values worked for me.

flugenheimer commented 4 years ago

Ref: https://colab.research.google.com/drive/1dkpJCMEZdPz6fwgfKPW6VV1AiE6qS5uL?usp=sharing

I haven't tried with the COCO dataset, but this part also works fine for me.

What works fine is training a model ("yolo_first") and calculating the mAP with that same object.

My problem occurs when I afterwards create a new yolo object:

yolo_new = YOLOv4()
yolo_new.classes = "classes.names"
yolo_new.make_model()

and load the weights of the model trained as "yolo_first":

yolo_new.load_weights("/content/backup/weights/yolov4-final.weights", weights_type="yolo")

If I then try to predict on an image, I would expect the same performance as with "yolo_first", as well as the same mAP score, but that is not the case; the performance is very poor.

So what is confusing to me is why it works with the specific model object I train, but not when I reuse the saved weights in a new model.

flugenheimer commented 4 years ago

Hmm, I don't know what is wrong.

I don't know how you calculated the mAP, but it is strange that 90% came out. This is before the v2.0.0 release, so it is difficult to help more right now. After the release, I will test under conditions similar to yours.

I ended up using this library to calculate the mAP score, to avoid having to save things: https://pypi.org/project/mapcalc/

I get a 0.90 score (I just converted it to 90%). The score is quite high on my data because it is purely artificial data and therefore relatively easy to detect.
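
For reference, a minimal sketch of how mapcalc is used, with made-up boxes and labels; it takes ground-truth and detection dicts with 'boxes' and 'labels' keys ('scores' as well for the detections) and an IoU threshold:

from mapcalc import calculate_map

# Made-up example data: boxes are (x1, y1, x2, y2), labels are class ids.
ground_truth = {
    "boxes": [[60, 80, 166, 192]],
    "labels": [0],
}
result = {
    "boxes": [[62, 78, 170, 190]],
    "labels": [0],
    "scores": [0.97],
}

# mAP at a single IoU threshold of 0.5.
print(calculate_map(ground_truth, result, 0.5))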

hhk7734 commented 4 years ago

It is difficult to answer because I cannot reproduce the issue. :confused:

flugenheimer commented 4 years ago

I'll see if I can recreate it with a dataset that I can share here.

flugenheimer commented 4 years ago

I tried to recreate the problem with a cat dataset. Here are some of the results (https://colab.research.google.com/drive/1-bB9SIAiupsdlLU7CyqDQ0hJarEhynUo?usp=sharing):

Prediction with the originally trained yolo detector: [image]

Prediction with a new yolo detector whose weights were loaded from the trained one: [image]

Since it is the same image used for both predictions, and should be the same weights, I would have expected the same results and scores.

I also tried using the mAP score method that you use (https://github.com/Cartucho/mAP.git). The mAP score is calculated on the same dataset; the only difference is that the first run uses the model I trained, and the second loads the weights of the first one: [image]

hhk7734 commented 4 years ago

This may have something to do with https://github.com/hunglc007/tensorflow-yolov4-tflite/issues/165. There seems to be something wrong with saving or loading values. First, I'll save the weights in TensorFlow format and run the same test as you did. If there is no problem, I will examine the model itself.

Thank you :)

flugenheimer commented 4 years ago

This may have something to do with hunglc007#165. There seems to be something wrong with saving or loading values. First, I'll save the weights in TensorFlow format and run the same test as you did. If there is no problem, I will examine the model itself.

Thank you :)

Looking forward to hearing about the test, and to a fix for the issue :)

Also, thank you for creating this yolo library to make it simpler to use and train yolo.

hhk7734 commented 3 years ago

I tested the tf format and found no problem.

I read your script once again: yolo.input_size = 416, but yolo_new.input_size = 608 (the default). So the results are different.

If you set yolo_new.input_size = 416 before yolo_new.make_model(), yolo and yolo_new will give the same results.
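
Putting that together, a minimal sketch of the corrected loading code, using the paths from earlier in this thread:

from yolov4.tf import YOLOv4

yolo_new = YOLOv4()
yolo_new.classes = "/content/classes.names"
yolo_new.input_size = 416  # must match the size used at training time; the default is 608
yolo_new.make_model()
yolo_new.load_weights(
    "/content/backup/weights/yolov4-final.weights",
    weights_type="yolo",
)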

flugenheimer commented 3 years ago

I tested the tf format and found no problem.

I read your script once again: yolo.input_size = 416, but yolo_new.input_size = 608 (the default). So the results are different.

If you set yolo_new.input_size = 416 before yolo_new.make_model(), yolo and yolo_new will give the same results.

Nicely spotted :)

I tried setting the input size, and visually it gives the same results; the prediction score is now the same on the same image. [image]

I did manage to find a strange behaviour with the save_dataset_for_mAP function. You have to reload the dataset each time you want to use save_dataset_for_mAP, otherwise a strange multiplication (or something similar) happens to the ground-truth values written to the "mAP/input/ground-truth" folder:

first time: cat_head 479 199 808 516
second time: cat_head 486440 203929 820469 529350
third time: cat_head 493737435 208823994 832776689 542054400

and so on...
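
The values grow geometrically (each set is roughly 1000x the previous one), which is the pattern you get when normalized coordinates are denormalized in place, so that every further call scales the already-scaled values. A purely hypothetical illustration of that pattern, not the library's actual code:

import numpy as np

# Made-up normalized box and image size, for illustration only.
boxes = np.array([[0.47, 0.19, 0.79, 0.50]])
width, height = 1015, 1025

for call in range(1, 4):
    # In-place denormalization corrupts the stored values for the next call.
    boxes[:, 0::2] *= width
    boxes[:, 1::2] *= height
    print(f"call {call}:", boxes[0].astype(np.int64))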

I'm going to close this issue and open a new one for this mAP behaviour :)