Sharpz7 opened 3 months ago
Maybe we could use YOLO-X code as base? https://github.com/Megvii-BaseDetection/YOLOX
It's Apache though. Not sure if Apache can be turned into MIT or what the process is. But at least it would involve less reinventing the wheel.
This seems like a smart idea to me @WongKinYiu?
I can try implementing a working version with the YOLO-X base just to get the ball rolling.
Okay, here's the minimum working version using the YOLOX base: https://github.com/Y-T-G/YOLOv9-Neo
Changes:
- `yolox/models/backbone.py`
- `yolox/models/network_blocks.py`
- `yolox/models/yolo_pafpn.py`
The ADown layer has a different implementation (a modified version of the Down layer from UNet, but with RepConv), because I couldn't find any implementation of this layer other than the one from Ultralytics. It's a very simple layer, and since it is so simple I can't think of any other way to replicate it.

I tried training on coco128 just for testing, and the loss seemed to be decreasing. But who knows, it might be broken; if something looks seriously wrong, it probably is. I don't have a GPU to test the training, and only ran it briefly on Colab. I just wanted to get a minimum working version up to catalyze the work.
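For reference, the published ADown block halves the spatial resolution by splitting the input channels into two stride-2 paths (a pooled 3x3-conv path, and a max-pool plus 1x1-conv path) and concatenating the results. A rough sketch of the shape bookkeeping in plain Python, where the helper names are illustrative and the exact pooling/padding details may differ from the real layer:

```python
# Illustrative shape bookkeeping for an ADown-style downsampling block.
# Shapes are (channels, height, width); all helper names are hypothetical.

def conv_out(hw, kernel, stride, pad):
    """Spatial size after a conv/pool: floor((hw + 2*pad - kernel) / stride) + 1."""
    return (hw + 2 * pad - kernel) // stride + 1

def adown_shape(c_in, c_out, h, w):
    # Split the input channels into two halves.
    c_half = c_in // 2
    # Path 1: 3x3 stride-2 conv (pad 1) on the first half -> c_out // 2 channels.
    h1, w1 = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    # Path 2: 3x3 stride-2 max pool (pad 1), then a 1x1 conv (spatially neutral).
    h2, w2 = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
    assert (h1, w1) == (h2, w2)  # both paths halve the spatial dims
    # Concatenate the two halves along the channel axis.
    return (c_out // 2 + c_out // 2, h1, w1)

print(adown_shape(256, 256, 64, 64))  # -> (256, 32, 32)
```

Whatever the internals, the key property to preserve is that the output has half the spatial resolution and the configured channel count.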
Training:

```shell
python tools/train.py -f exps/yolov9/gelanc.py -d 0 -b 16 --fp16
```

Prediction:

```shell
python tools/demo.py image -f exps/yolov9/gelanc.py -c YOLOX_outputs/gelanc/latest_ckpt.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result
```
If your problem is compute, I can help there.
Also note that Apache and MIT are compatible - and for most people are the same level of "open". I think going with Apache is fine.
If something is "trivially" simple, I think we can prove so by just linking a paper that talks about it / its origins as close as we can get it. We'll have to be careful about linking in comments where everything came from.
@Sharpz7 Nevermind. It seems `ADown` was introduced by YOLOv9, so we should be able to use it, since @WongKinYiu is the copyright holder in that case.
https://github.com/ultralytics/ultralytics/commit/2071776a3672eb835d7c56cfff22114707765ac6
I wrote this loader, which can load the weights for the backbone and neck from the ultralytics model, for verification. Of course, it can't load the weights for the head.
```python
from collections import OrderedDict

from yolox.models import YOLOX, YOLOPAFPN, YOLOXHead
from ultralytics import YOLO

head_in_channels = [256, 512, 512]
num_bnecks = 1

backbone = YOLOPAFPN(1, 1, out_channels=head_in_channels, num_bnecks=num_bnecks, depthwise=False)
head = YOLOXHead(80, 1, in_channels=head_in_channels, depthwise=False)
yolov9_neo = YOLOX(backbone, head)

yolov9_ultralytics = YOLO("yolov9c.pt")

def load_yolo_weights(yolo_9x, yolo_9u, backbone_key_len=853):
    # Walk the two state dicts in parallel and keep a pair only when the
    # last two dot-components of the keys agree (e.g. both end in "conv.weight").
    y9x_keys = list(yolo_9x.state_dict().keys())[:backbone_key_len]
    y9u_keys = list(yolo_9u.model.model.state_dict().keys())[:backbone_key_len]
    y9u_sd = yolo_9u.model.model.state_dict()
    new_sd = OrderedDict()
    for kx, ku in zip(y9x_keys, y9u_keys):
        if ".".join(kx.split(".")[-2:]) == ".".join(ku.split(".")[-2:]):
            new_sd[kx] = y9u_sd[ku]
    return new_sd

paired_sd = load_yolo_weights(yolov9_neo.backbone, yolov9_ultralytics)
yolov9_neo.backbone.load_state_dict(paired_sd, strict=False)
```
Using the same weights, I verified the output up to the end of the neck, and it matches the original Gelan-C implementation.
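The pairing rule in `load_yolo_weights` (keep a pair when the last two dot-components of the two keys agree) can be illustrated with plain strings; the keys below are made up for the example and are not the real state-dict keys:

```python
# Illustrative demo of the suffix-matching rule used by the loader above:
# two parameters are paired when the last two dot-separated components of
# their state-dict keys agree (e.g. both end in "conv.weight").

def suffix_match(key_a, key_b, depth=2):
    """True when the last `depth` dot-components of the two keys agree."""
    return key_a.split(".")[-depth:] == key_b.split(".")[-depth:]

# Hypothetical key names, just to show the idea.
yolox_keys = ["backbone.stem.conv.weight", "backbone.stem.bn.bias"]
ultra_keys = ["model.0.conv.weight", "model.0.bn.bias"]

pairs = [(kx, ku) for kx, ku in zip(yolox_keys, ultra_keys) if suffix_match(kx, ku)]
print(pairs)
```

The heuristic relies on the two models enumerating layers in the same order, which is why only the backbone and neck (whose structure matches) can be loaded this way.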
But the loss doesn't improve after the first few steps. I think the YOLOX head is not compatible with it, even though YOLOX is also an anchor-free detector. I haven't dug deeper into the code for the head because it's quite dense. Also, the problem with using the original YOLOv9 head is that it is a derivative of the YOLOv8 head, and I am not sure how to reimplement it without running into the GPL.
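For context on why the heads behave differently: the YOLOv8-family head (which the YOLOv9 head derives from) regresses each box side as a softmax distribution over discrete bins and takes its expectation (the DFL formulation), while the YOLOX head predicts offsets directly. A minimal, illustrative sketch of the expectation step in plain Python, not code from either repo:

```python
import math

# Sketch of DFL-style box regression as used by YOLOv8-family heads:
# each box side is predicted as logits over reg_max discrete bins, and
# the regressed distance is the expectation of the softmax distribution.

def dfl_expectation(logits):
    """Softmax over bin logits, then the expected bin index (a float distance)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return sum(i * p for i, p in enumerate(probs))

# A distribution strongly peaked at bin 3 regresses a distance close to 3.0.
print(round(dfl_expectation([0.0, 0.0, 0.0, 8.0, 0.0]), 2))
```

Matching this formulation (and its loss) would likely be necessary for the pretrained-style head behavior, which is exactly the part entangled with the GPL codebase.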
Hey @Y-T-G. Can you explain in a bit more detail the state of everything? Maybe start a new issue? Especially the state of your repo yolov9-neo: is it actually not using any components from ultralytics?
Maybe a "design doc" of sorts to make it clear what still needs to be done would be great.
Also, you think that building yolov9 into yolox is the way to go, yes? We should maybe go in that direction.
We should also ask the YOLOv9 project to clearly label which parts of their codebase come from ultralytics, if they don't already.
> Hey @Y-T-G. Can you explain in a bit more detail the state of everything? Maybe start a new issue? Especially the state of your repo yolov9-neo: is it actually not using any components from ultralytics?
>
> Maybe a "design doc" of sorts to make it clear what still needs to be done would be great.
>
> Also, you think that building yolov9 into yolox is the way to go, yes? We should maybe go in that direction.
Currently, YOLOv9-Neo doesn't use any code that wasn't introduced by @WongKinYiu. As for the code that was introduced by @WongKinYiu, I guess we can use it, since it is his code, if he allows it. I have added some layers from it, with some cosmetic changes. A lot of the YOLO-X code is reusable, including the layers, so I mostly integrated the missing pieces with what was already there.
What we have as of now:
Major parts left:
YOLOv9-Neo currently uses the YOLO-X head. This is different from the original YOLOv9 head, which is a slightly modified YOLOv8 head; I haven't implemented it since it is a derivative of YOLOv8.
I previously mentioned that the loss wasn't decreasing. It does decrease, but really slowly; this seems to be a common issue with the YOLOX repo. To fix it, we have to somehow improve the head and the loss calculation.
After initializing the backbone with pretrained weights, I had this result after 4 epochs on a single class dataset:
```
Average forward time: 37.20 ms, Average NMS time: 0.97 ms, Average inference time: 38.18 ms
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.063
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.200
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.019
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.061
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.126
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.050
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.137
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.200
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.072
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.218
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.325
```
per class AP:
| class | AP |
|:--------|:------|
| person | 6.263 |
per class AR:
| class | AR |
|:--------|:-------|
| person | 20.010 |
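The AP/AR figures above are thresholded on box IoU (e.g. AP@0.50 counts a detection as a true positive when its IoU with a ground-truth box is at least 0.5). A minimal IoU computation for axis-aligned (x1, y1, x2, y2) boxes:

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Intersection rectangle (empty when the boxes do not overlap).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    # Union = sum of areas minus the double-counted intersection.
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping by half give IoU = 50 / 150 = 1/3.
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

The AP@[0.50:0.95] rows average this criterion over ten IoU thresholds from 0.5 to 0.95, which is why they are much lower than AP@0.50 for a weakly trained model.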
The training parameters are defined at `exps/yolov9/gelanc.py`. You can read the YOLO-X repo to understand how it works.
@Y-T-G thank you very much for writing all this up.
Hopefully it helps with future work. I am still stuck with other projects right now, but keeping this on my radar :))
@Sharpz7 Yeah, I have other stuff too. Hopefully someone picks it up.
I found an implementation of yolo-v8 that is entirely free of ultralytics.
This could be useful?
> I found an implementation of yolo-v8 that is entirely free of ultralytics.
> This could be useful?
I had come across it. But I am not sure if it truly escapes the GPL's definition of a derivative work. Some of the code is the same as the code in ultralytics, with just cosmetic changes and Keras functions in place of torch, which is sort of like a translation.
I'll be happy to pick this up!
@Y-T-G I agree, but a few notes:
Ultralytics are aware of the keras work and have commented on the PR without concerns (https://github.com/keras-team/keras-cv/pull/1711#issuecomment-1519989260)
Also see https://github.com/keras-team/keras-cv/issues/2412 - since Keras have published this code under ApacheV2, it is on them to ensure that it actually is, not us. If they have built it from the YoloV8 Paper, and not the code, I think that is acceptable - you cannot patent a research paper.
@GrantorShadow Great! Let us know anything that you would need. A PR to either a YOLO-X fork, if you want to use that as a base, or to yolov9mit would work.
With a PR it is easier to review. @Y-T-G, if you could do that with your work too, that would be great. Then I will keep a list of "active PRs" at the top of this issue :))
@Sharpz7 I don't think there is any yolov8 paper, as Ultralytics did not publish one.
https://github.com/Chris-hughes10/Yolov7-training/ is also an interesting implementation of the YOLOv7 model if some of the modules are required.
@ahmadmughees Sadly, since it's under GPL, I would rather avoid it :((
Related Issues:
I envision that all required changes should be tracked, potentially using GitHub Projects.
That way, people can come and go from active development and get involved, and only a small number (1-2 people) will need to actively keep track of everything:
Note that I am not a lawyer, but I think it would be good if we make sure to cite exactly where the information behind all of our new code was collected from.
Thanks!