WongKinYiu / YOLO

An MIT rewrite of YOLOv9
MIT License

[EPIC] Scope out all work that needs to be completed #3

Open Sharpz7 opened 3 months ago

Sharpz7 commented 3 months ago

Related Issues:

I envision that all required changes should be tracked, potentially using GitHub Projects.

That way, people can come and go from active development and get involved, and only a small number (1-2 people) will need to actively keep track of everything:

Note that I am not a lawyer, but I think it would be good if we make sure we cite exactly the information we have collected for all our new code.

Thanks!

First Draft

Y-T-G commented 3 months ago

Maybe we could use YOLO-X code as base? https://github.com/Megvii-BaseDetection/YOLOX

It's Apache though. Not sure if Apache can be turned into MIT or what the process is. But at least it would involve less reinventing the wheel.

Sharpz7 commented 2 months ago

This seems like a smart idea to me @WongKinYiu?

Y-T-G commented 2 months ago

I can try implementing a working version with the YOLO-X base just to get the ball rolling.

Y-T-G commented 2 months ago

Okay, here's the minimum working version using the YOLOX base: https://github.com/Y-T-G/YOLOv9-Neo

Changes:

I tried training on coco128 just for testing and the loss seemed to be decreasing. But who knows, it might be broken. If something looks seriously wrong, it probably is. I don't have a GPU to test the training, so I only ran it briefly on Colab. I just wanted to get the minimum working version up to catalyze the work.

Training: python tools/train.py -f exps/yolov9/gelanc.py -d 0 -b 16 --fp16

Prediction: python tools/demo.py image -f exps/yolov9/gelanc.py -c YOLOX_outputs/gelanc/latest_ckpt.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result

Sharpz7 commented 2 months ago

If your problem is compute, I can help there.

Also note that Apache and MIT are compatible - and for most people are the same level of "open". I think going with Apache is fine.

If something is "trivially" simple, I think we can prove so by just linking a paper that talks about it / its origins as close as we can get it. We'll have to be careful about linking in comments where everything came from.
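One lightweight way to keep that provenance next to the code is a decorator that records where each component came from. This is only a sketch: the `provenance` decorator, the `PROVENANCE` registry, and the placeholder source string are hypothetical names for illustration, not something that exists in any of the repos mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Origin:
    source: str   # paper, repo, or commit the component was derived from
    license: str  # license of that source, as best we know it
    note: str = ""


# Hypothetical registry: component name -> Origin.
PROVENANCE = {}


def provenance(source, license, note=""):
    """Record where a function or class came from, then return it unchanged."""
    def decorator(obj):
        PROVENANCE[obj.__qualname__] = Origin(source, license, note)
        return obj
    return decorator


# Placeholder values only; a real entry would link the actual paper or commit.
@provenance(source="<link to paper or commit>", license="Apache-2.0",
            note="placeholder citation")
def example_layer(x):
    return x


def provenance_report():
    """One line per tracked component, sorted by name."""
    return "\n".join(
        f"{name}: {o.source} [{o.license}]"
        for name, o in sorted(PROVENANCE.items())
    )


print(provenance_report())
```

A report like this could be regenerated on every PR, so reviewers see at a glance which components still lack a citation.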

Y-T-G commented 2 months ago

@Sharpz7 Never mind. It seems ADown was introduced by YOLOv9, so we should be able to use it, since @WongKinYiu is the copyright holder in that case.

https://github.com/ultralytics/ultralytics/commit/2071776a3672eb835d7c56cfff22114707765ac6

Y-T-G commented 2 months ago

I wrote this loader, which can load the weights for the backbone and neck from the ultralytics model for verification. Of course, it can't load the weights for the head.

from collections import OrderedDict
from yolox.models import YOLOX, YOLOPAFPN, YOLOXHead
from ultralytics import YOLO

head_in_channels = [256, 512, 512]
num_bnecks = 1
backbone = YOLOPAFPN(1, 1, out_channels=head_in_channels, num_bnecks=num_bnecks, depthwise=False)
head = YOLOXHead(80, 1, in_channels=head_in_channels, depthwise=False)
yolov9_neo = YOLOX(backbone, head)
yolov9_ultralytics = YOLO("yolov9c.pt")

def load_yolo_weights(yolo_9x, yolo_9u, backbone_key_len=853):
    # Pair the first `backbone_key_len` keys of both models by position.
    y9x_keys = list(yolo_9x.state_dict().keys())[:backbone_key_len]
    y9u_sd = yolo_9u.model.model.state_dict()
    y9u_keys = list(y9u_sd.keys())[:backbone_key_len]
    new_sd = OrderedDict()
    for kx, ku in zip(y9x_keys, y9u_keys):
        # Keep a pair only if the last two key components match (e.g. "conv.weight").
        if ".".join(kx.split(".")[-2:]) == ".".join(ku.split(".")[-2:]):
            new_sd[kx] = y9u_sd[ku]

    return new_sd

paired_sd = load_yolo_weights(yolov9_neo.backbone, yolov9_ultralytics)
yolov9_neo.backbone.load_state_dict(paired_sd, strict=False)
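The pairing logic in the loader can be tried without either model installed. The sketch below uses plain dicts with made-up key names standing in for the two state dicts (the key names and the `pair_by_position` helper are assumptions for illustration, not the real ones). Keys are paired by position, and a pair is kept only when the last two dotted components match.

```python
from collections import OrderedDict


def key_suffix(key, n=2):
    """Return the last n dot-separated components of a state-dict key."""
    return ".".join(key.split(".")[-n:])


def pair_by_position(target_sd, source_sd, limit=None):
    """Pair keys positionally; keep a pair only if the key suffixes agree."""
    target_keys = list(target_sd)[:limit]
    source_keys = list(source_sd)[:limit]
    paired, skipped = OrderedDict(), []
    for kt, ks in zip(target_keys, source_keys):
        if key_suffix(kt) == key_suffix(ks):
            paired[kt] = source_sd[ks]
        else:
            skipped.append((kt, ks))
    return paired, skipped


# Made-up keys; the values would be tensors in the real models.
target = OrderedDict([
    ("backbone.stem.conv.weight", 0),
    ("backbone.stem.bn.bias", 0),
    ("backbone.extra.conv.weight", 0),
])
source = OrderedDict([
    ("model.0.conv.weight", 1),
    ("model.0.bn.bias", 2),
    ("model.1.bn.weight", 3),
])

paired, skipped = pair_by_position(target, source)
print(list(paired))
print(skipped)
```

Returning the skipped pairs makes mismatches visible, which matters because `load_state_dict(..., strict=False)` does not raise for keys that were never filled in.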

I verified the output using the same weights until the end of the neck and it's the same as the original implementation for Gelan-C.

But the loss doesn't improve after the first few steps. I think the YOLOX head is not compatible with it, even though YOLOX is also an anchor-free detector. I haven't delved deeper into the code for the head because it is hard to digest. Also, the problem with using the original YOLOv9 head is that it is a derivative of the YOLOv8 head. I am not sure how to implement that and stay clear of the GPL.

Sharpz7 commented 2 months ago

Hey @Y-T-G. Can you explain in a bit more detail the state of everything? Maybe start a new issue? Especially the state of your repo yolov9-neo, is it actually not using any components from ultralytics?

Maybe a "design doc" of sorts, to make it clear what still needs to be done, would be great.

Also, you think that building YOLOv9 into YOLOX is the way to go, yes? We should maybe go in that direction.

Sharpz7 commented 2 months ago

We should also ask the YOLOv9 authors to clearly label which parts of their codebase come from ultralytics, if they don't already.

Y-T-G commented 2 months ago

Currently, YOLOv9-Neo doesn't use any code that wasn't introduced by @WongKinYiu. As for code that has been introduced by @WongKinYiu, I guess we can use it, since it is his code, if he allows it. I have added some layers from it, with some cosmetic changes. A lot of the YOLO-X code is reusable, including the layers. So I mostly integrated what was present with what wasn't.

What we have as of now:

Major parts left:

*YOLOv9-Neo currently uses the YOLO-X head. This is different from the original YOLOv9 head, which is a slightly modified YOLOv8 head. I haven't implemented it since it is a derivative of YOLOv8.

I previously mentioned that the loss wasn't decreasing. It does decrease, but really slowly. This seems to be a common issue with the YOLOX repo. To fix it, we have to somehow improve the head and the loss calculation.

After initializing the backbone with pretrained weights, I had this result after 4 epochs on a single class dataset:

Average forward time: 37.20 ms, Average NMS time: 0.97 ms, Average inference time: 38.18 ms
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.063
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.200
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.019
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.061
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.126
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.050
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.137
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.200
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.072
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.325
per class AP:
| class   | AP    |
|:--------|:------|
| person  | 6.263 |
per class AR:
| class   | AR     |
|:--------|:-------|
| person  | 20.010 |
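For comparing runs, the summary lines above can be turned into numbers with a small parser. This is a sketch that assumes the line layout shown in the log; `parse_coco_summary` is a hypothetical helper, not part of YOLOX or pycocotools.

```python
import re

# Matches lines such as:
#  Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.063
LINE_RE = re.compile(
    r"\((AP|AR)\)\s*@\[\s*IoU=([\d.:]+)\s*\|\s*area=\s*(\w+)\s*\|"
    r"\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_coco_summary(text):
    """Map (metric, IoU, area, maxDets) -> value for every summary line found."""
    results = {}
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            metric, iou, area, max_dets, value = m.groups()
            results[(metric, iou, area, int(max_dets))] = float(value)
    return results


# Three lines copied from the log above.
log = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.063
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.200
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.050
"""

summary = parse_coco_summary(log)
print(summary[("AP", "0.50:0.95", "all", 100)])
```

Note that the per-class tables appear to report the same quantities on a 0-100 scale (6.263 for `person` against 0.063 overall on this single-class dataset).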

The training parameters are defined at: exps/yolov9/gelanc.py. You can read the YOLO-X repo to understand how it works.

Sharpz7 commented 2 months ago

@Y-T-G thank you very much for writing all this up.

Hopefully it helps with future work. I am still stuck with other projects right now, but keeping this on my radar :))

Y-T-G commented 2 months ago

@Sharpz7 Yeah, I have other stuff too. Hopefully someone picks it up.

Sharpz7 commented 2 months ago

I found an implementation of yolo-v8 that is entirely free of ultralytics.

https://github.com/keras-team/keras-cv/tree/c60112e4dc38d8de35dd6d2151dd786e6de8e8f2/keras_cv/models/object_detection/yolo_v8

This could be useful?

Y-T-G commented 2 months ago

I had come across it. But I am not sure they're truly escaping the GPL's definition of a derivative work. Some of the code is the same as the code in ultralytics, with only cosmetic changes and Keras functions in place of torch ones, which is sort of like translation.

https://softwareengineering.stackexchange.com/questions/260347/ship-of-theseus-applied-to-gpl-can-i-relicense-my-program-if-i-replace-all-of

GrantorShadow commented 2 months ago

I'll be happy to pick this up!

Sharpz7 commented 2 months ago

@Y-T-G I agree, but a few notes:

Ultralytics are aware of the keras work and have commented on the PR without concerns (https://github.com/keras-team/keras-cv/pull/1711#issuecomment-1519989260)

Also see https://github.com/keras-team/keras-cv/issues/2412 - since Keras has published this code under Apache 2.0, it is on them, not us, to ensure that it can actually be licensed that way. If they have built it from the YoloV8 paper, and not the code, I think that is acceptable - you cannot patent a research paper.

Sharpz7 commented 2 months ago

@GrantorShadow Great! Let us know if there is anything you need. Ideally the work would come as a PR, either to a YOLO-X fork if you want to use that as a base, or to yolov9mit.

A PR is easier to review. @Y-T-G, if you could do that with your work too, that would be great. Then I will keep a list of "active PRs" at the top of this issue :))

ahmadmughees commented 2 months ago

@Sharpz7 I don't think there is a YOLOv8 paper, as Ultralytics did not publish one.

ahmadmughees commented 2 months ago

https://github.com/Chris-hughes10/Yolov7-training/ is also an interesting implementation of the YOLOv7 model if some of the modules are required.

Sharpz7 commented 2 months ago

@ahmadmughees Sadly, since it's under GPL, I would rather avoid it :((