hhk7734 / tensorflow-yolov4

YOLOv4 Implemented in Tensorflow 2.
MIT License

bounding box #80

Closed: hitch22 closed this issue 3 years ago

hitch22 commented 3 years ago

Howdy, I recently followed the example in the docs and fine-tuned the pre-trained yolov4-tiny model. The resulting custom-trained model seems to predict the class correctly.

However, the bounding boxes predicted by the model are all minuscule boxes resembling dots. Interestingly, the boxes sit roughly at the center of each detected object but are completely out of proportion.

I have double-checked my training data and the corresponding YOLO txt annotations with labelImg and cv2, and everything seems to check out. I literally copied and pasted the tutorial; the only thing I changed is the custom dataset I prepared, and it looks alright upon inspection.
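
For reference, the kind of cv2 sanity check described above might look roughly like the sketch below. It assumes the standard YOLO txt layout of one "class_id cx cy w h" line per box with values normalized to [0, 1]; the file names are placeholders, not taken from the attached data.

# Rough sketch: overlay YOLO txt annotations on an image with cv2.
# Assumes the standard format "class_id cx cy w h", normalized to [0, 1].
# The file names below are placeholders.
import cv2

image = cv2.imread("BOU_0010.jpg")
height, width = image.shape[:2]

with open("BOU_0010.txt") as f:
    for line in f:
        if not line.strip():
            continue
        class_id, cx, cy, w, h = line.split()
        # Scale the normalized center/size values back to pixels.
        cx, w = float(cx) * width, float(w) * width
        cy, h = float(cy) * height, float(h) * height
        x_min, y_min = int(cx - w / 2), int(cy - h / 2)
        x_max, y_max = int(cx + w / 2), int(cy + h / 2)
        cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), 2)
        cv2.putText(image, class_id, (x_min, max(y_min - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

cv2.imshow("annotation check", image)
cv2.waitKey(0)
cv2.destroyAllWindows()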

Is there a chance this is a bug?

I'm using the latest version 3.2.0 BTW.

hhk7734 commented 3 years ago

Would you share your model? I need to reproduce the problem in order to solve it.

hitch22 commented 3 years ago

Sure, attached is the model I trained to recognize industrial equipment and a target picture for you to run tests on. Please let me know if you need anything else.

Thanks in advance! trained_model.tar.gz

hhk7734 commented 3 years ago

What is your predicted image? Is it the same as the one below?

image

hitch22 commented 3 years ago

I am surprised you got better, larger bounding boxes; somehow I get much smaller boxes around the center point. My expectation was that the bounding boxes would cover the entirety of the equipment, not just parts of it. Is this a training issue? I noticed that the total loss plateaus at around 40. Any pointers would be appreciated.

I attached the xml and txt annotation files so you have an idea of what I mean. train_data_example.tar.gz
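
For reference, since both xml and txt annotations are mentioned, the usual Pascal VOC to YOLO conversion can be sketched roughly as below. The tag names follow the standard VOC layout and the file name is a placeholder, not taken from the attached archive.

# Rough sketch: read one Pascal VOC xml annotation and print YOLO txt values.
# Standard VOC tags are assumed; the file name is a placeholder.
import xml.etree.ElementTree as ET

def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    # Corner coordinates in pixels -> normalized center x/y, width, height.
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h

xml_root = ET.parse("BOU_0010.xml").getroot()
img_w = int(xml_root.find("size/width").text)
img_h = int(xml_root.find("size/height").text)

for obj in xml_root.iter("object"):
    box = obj.find("bndbox")
    xmin, ymin, xmax, ymax = (float(box.find(tag).text)
                              for tag in ("xmin", "ymin", "xmax", "ymax"))
    # A real converter would map the class name to its index in classes.names.
    print(obj.find("name").text, *voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h))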

Thanks in advance!

hhk7734 commented 3 years ago

I didn't change anything and just ran it.

When I train with darknet, I run three trainings at the same time (my PC and two Colab instances). I either watch each run until the end of training, or watch how the loss changes, kill the worst run, and start a new one.

It is difficult to explain in detail because there are various training and tuning methods. Ref:

phillips96 commented 3 years ago

Did you upscale the detections by the size of the image? Taking the raw bounding box will just give you a point at the centre, with the width and height as fractions. From there you need to work out the coordinates of the corners. The 'detect' function does this for you, but 'predict' does not, from what I've seen.
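
For reference, the upscaling described above can be sketched roughly as follows. It assumes each predicted row starts with normalized center x, center y, width, and height; the helper name and the exact column layout are assumptions, not taken from the library.

# Rough sketch: turn normalized center-format boxes into pixel corner boxes.
# Assumes each row is [cx, cy, w, h, ...] with values in [0, 1]; this column
# layout is an assumption, not confirmed against the library source.
import numpy as np

def to_pixel_corners(bboxes, image_shape):
    height, width = image_shape[:2]
    cx, cy = bboxes[:, 0] * width, bboxes[:, 1] * height
    w, h = bboxes[:, 2] * width, bboxes[:, 3] * height
    x_min, y_min = cx - w / 2, cy - h / 2
    x_max, y_max = cx + w / 2, cy + h / 2
    return np.stack([x_min, y_min, x_max, y_max], axis=-1)

# e.g. corners = to_pixel_corners(bboxes, frame.shape) after yolo.predict(...)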

hhk7734 commented 3 years ago
import cv2
from pathlib import Path
import numpy as np

from yolov4.tf import YOLOv4

yolo = YOLOv4()

root = Path("C:/Users/dev/Downloads")

# Load the class names and the network configuration.
yolo.config.parse_names(root.joinpath("classes.names"))
yolo.config.parse_cfg(root.joinpath("yolov4-tiny.cfg"))

# Build the model, load the trained weights, and print a summary.
yolo.make_model()
yolo.load_weights(root.joinpath("yolov4-tiny-24000-step.weights"), weights_type="yolo")
yolo.summary(summary_type="yolo")

# Run the built-in inference helper on the test image.
yolo.inference(str(root.joinpath("BOU_0010.jpg")))

Instead of yolo.inference, you can use predict and draw_bboxes directly:

cv2.namedWindow("result", cv2.WINDOW_AUTOSIZE)

# cv2 reads images as BGR; convert to RGB before passing the frame to predict.
frame = cv2.imread(str(root.joinpath("BOU_0010.jpg")))
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

bboxes = yolo.predict(frame_rgb, prob_thresh=0.25)

# Draw the detections on the original frame and show the result until 'q' is pressed.
image = yolo.draw_bboxes(frame, bboxes)
cv2.imshow("result", image)

while cv2.waitKey(10) & 0xFF != ord("q"):
    pass
cv2.destroyWindow("result")
hitch22 commented 3 years ago

I tried your code and got roughly the same bounding boxes as yours. For now I am training multiple models with different configs. I think the bounding box problem occurs when the target object appears relatively small in the image.

hitch22 commented 3 years ago

Using darknet for the training, along with modifying the network architecture to detect smaller objects, helped tremendously with mAP and bounding box accuracy. Thanks for the advice, guys!