keras-team / keras-cv

Industry-strength Computer Vision workflows with Keras

Add YOLO-V3 model #622

Closed innat closed 1 year ago

innat commented 2 years ago

Short Description

AKA, You Only Look Once. A strong object detection model, described in the following paper.

Papers

https://arxiv.org/abs/1804.02767?fbclid=IwAR3A2nyK8EPa-JcoGp_N6tNqVkKOmy2J1ip5AYcEki5FzkZ62E3z6tbNSy0

Published on: 2018 Cited by 14236 (until)

Existing Implementations

Here is one well-known and strong reference implementation:

https://github.com/zzh8829/yolov3-tf2 by @zzh8829

Other Information

https://github.com/pythonlessons/TensorFlow-2.x-YOLOv3

LukeWood commented 2 years ago

I'm not sure we'll be supporting the earlier YOLOs, such as v3

innat commented 2 years ago

Is there any reason why v3 might not be supported despite its high citation count? Also, recent YOLO-based models like yolo-v4, yolo-r, yolo-x, and yolo-v7 all reference the v3 model.

LukeWood commented 2 years ago

Is there ever a case where users would want to use v3 over v7?

DavidLandup0 commented 2 years ago

I'm with LukeWood. There's no need to have v3 support over faster, more accurate models.

innat commented 2 years ago

> Is there ever a case where users would want to use v3 over v7?

> There's no need to have v3 support over faster, more accurate models.

IMHO, that's a bad argument. If v7 is included now and newer YOLO versions become available in the future, that doesn't mean we will remove the older version.

YOLO-V3 is maintained in the tf-model-garden, and I don't see any good reason not to have it in keras-cv.

DavidLandup0 commented 2 years ago

The garden isn't up to date. Here's an excerpt of the description there:

> YOLO v3 and v4 serve as the most up to date and capable versions of the YOLO network group.

v4 isn't "official" (by Redmon et al.), so it's not like they limited themselves to the original authors. v5 should be in the list, since it's been around since 2020. v7 is understandably not there, as it was released a few weeks ago.

They seem to want to include the latest versions (even though the listed ones aren't the latest), just as YOLOv7 should be added now that it's new and up to date. Models that are added once tend to be kept for longer than advances in the technology would justify. Yes, if we add v7 now, we won't remove it in the future, but we'd want to add whichever vN is relevant at that time and discourage use of v7 once it clearly and objectively performs worse than the alternatives.

The point of KerasCV is to make training SoTA models easier. YOLOv3 isn't SoTA. YOLOv7 is.

To that end, I'd argue that VGGNets shouldn't be part of newer frameworks, since they've long been outdone, and the methodology is old and inefficient enough that it's bad practice to use them in modern settings. The design choices have shifted away from VGGNets enough to (IMO) warrant their avoidance.

Many people looking to apply CV to their domain don't want to spend days researching the best architectures or use cases. IMO, if we want to lower the barrier to entry for most people, we'll want to take the guesswork out of the equation. Way too many papers in medicine and biology use VGGNets in 2022, and most of them could be improved with simple, standard preprocessing and augmentation steps, as well as newer architectures. Supporting outdated models, IMO, makes it harder for CV to be adopted by the wider community (and other fields of science).

bhack commented 2 years ago

I think that we are losing sight of the scope of the Keras-cv manifesto a little in this thread.

The main goal here is to share reusable subcomponents to build the next network in the same/proxy family.

So I think it matters less whether we start with V3 or V7 than that tomorrow we have reusable components and the internal API (from V3/V4/V7?) to quickly build V8.

SOTA here is a quickly moving target, and many papers build on previous architectures while introducing new elements.

If we have strong internal modularity, we could be on par with competitors in releasing new SOTA models.

So the question is: if we start with V7, are we going to have 50%, 60%, 70%, 80% of reusable modules from the previous YOLO versions?

LukeWood commented 2 years ago

Great points, Stefano! I think that's a great take on the situation.

ayulockin commented 2 years ago

Great pointers by @bhack. To add two cents to this exact philosophy - @soumik12345 and I are working on implementing YOLOv2 (to learn and share mostly but also come up with modular APIs).

tanzhenyu commented 2 years ago

Follow-up comment here: we should support YOLOv3, as it is a great demonstration of "reusable components", and YOLOv3 is definitely still heavily used compared to more recent models (2022 is the so-called year of YOLO).

hnanacc commented 2 years ago

> Is there ever a case where users would want to use v3 over v7?

Most of the research projects in my lab use YoloV3 for object detection. I am not sure why, but I'll post here after I ask my supervisor.

EDIT: Because it's popular. There are more reference implementations, most benchmarks/papers use it, and more help is available if we want to do something extra.

hnanacc commented 2 years ago

I'm working on a related task; if no one has already started the work, I can take this up.

tanzhenyu commented 2 years ago

That's great, please go ahead. We don't assign issues for now, so that anyone with bandwidth can take one. One thing I would like to ask: please make sure you re-use the components that are currently used in FasterRCNN, such as box matchers, anchor generators, _target_gather, etc.
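For readers unfamiliar with the term, a "box matcher" assigns each anchor to a ground-truth box (or to background) based on overlap. The following is a minimal pure-Python sketch of that idea, not the KerasCV implementation: the names `iou` and `match_anchors`, the corner box format, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(
    anchors: Sequence[Box],
    gt_boxes: Sequence[Box],
    positive_iou: float = 0.5,  # hypothetical threshold, chosen for illustration
) -> List[int]:
    """For each anchor, return the index of its best ground-truth box, or -1."""
    matches = []
    for anchor in anchors:
        best_idx, best_iou = -1, 0.0
        for idx, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        # Anchors whose best overlap is below the threshold count as background.
        matches.append(best_idx if best_iou >= positive_iou else -1)
    return matches
```

Because the matcher only depends on boxes and a threshold, the same component can in principle serve any anchor-based detector, which is the kind of re-use being asked for here.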

hnanacc commented 2 years ago

Great, I will post my approach before starting the implementation, just to make sure I'm following all the pointers.

YELKHATTABI commented 1 year ago

Hello, I was trying to work with YoloV3 on Keras as well, using https://github.com/zzh8829/yolov3-tf2 by @zzh8829. The main struggle that I found is how to compute validation metrics during training.

Is there a way to have some kind of dynamic graph where the NMS is not used during training, and is used only when computing metrics?
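One way to picture the separation being asked about is to keep two output paths: raw predictions for the loss, and NMS-filtered predictions for metrics. The sketch below is plain Python under stated assumptions and is not taken from yolov3-tf2 or KerasCV; `DetectorSketch`, `raw_predict_fn`, and the thresholds are hypothetical names and values used only to illustrate the idea.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)
Detection = Tuple[Box, float]  # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    detections: List[Detection], iou_threshold: float = 0.5
) -> List[Detection]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    kept: List[Detection] = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


class DetectorSketch:
    """Hypothetical wrapper: raw outputs for the loss, NMS only for metrics."""

    def __init__(self, raw_predict_fn: Callable[[object], List[Detection]]):
        self.raw_predict_fn = raw_predict_fn

    def training_outputs(self, image: object) -> List[Detection]:
        # The loss is computed on every raw prediction, so NMS is skipped.
        return self.raw_predict_fn(image)

    def evaluation_outputs(
        self, image: object, iou_threshold: float = 0.5
    ) -> List[Detection]:
        # Metrics are computed on de-duplicated boxes, so NMS is applied here.
        return non_max_suppression(self.raw_predict_fn(image), iou_threshold)
```

In a Keras setting the same split would presumably mean keeping the decoding step out of the trained model's forward pass and calling it only in the validation code, but that mapping is an assumption rather than something confirmed in this thread.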

ianstenbit commented 1 year ago

Given that YOLOV8 is available now, this is not high-value to add.