AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Worse performance in this repo than in a yolov3 pytorch implementation #2914

Open drapado opened 5 years ago

drapado commented 5 years ago

I've tested this yolov3 implementation in pytorch: https://github.com/ultralytics/yolov3. I used the same dataset and the same yolov3-spp.cfg file (the same everything) for the tests. I achieved these results (consistent over several attempts):

AlexeyAB commented 5 years ago

Did you train 2 different models, one by using https://github.com/AlexeyAB/darknet and another by using https://github.com/ultralytics/yolov3 ?

How many Training and Validation images did you use?

What GPU did you use for training using Darknet?

Did you set CUDNN_HALF=1?
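
For reference, CUDNN_HALF is a compile-time flag at the top of the AlexeyAB/darknet Makefile; a sketch of a typical build config for a Tensor-Core GPU (Volta, RTX 20xx) would be:

GPU=1
CUDNN=1
CUDNN_HALF=1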

As claimed, +0.1% - 0.3% higher mAP accuracy can be achieved by using ultralytics/yolov3 rather than Darknet: https://github.com/ultralytics/yolov3#map


There are several assumptions here:

drapado commented 5 years ago

Yes, I trained two different models starting from darknet53.conv.74 and yolov3-spp.cfg adapted for 4 classes. I kept the rest of the hyperparameters equal in both frameworks. I have 4000 images in the train set and 1000 in the valid set.

I used an RTX 2060, so I trained with CUDNN_HALF=1, but I also added mixed precision training to the pytorch version through nvidia apex.amp.

I believe it's related to the presence of very small objects (smaller than 10x10 at img size=256). These objects belong to class 1, the class whose mAP improves the most in this pytorch implementation and also the class with the most examples.

AlexeyAB commented 5 years ago

You can try to convert your Darknet weights-file to a PyTorch .pt weights-file and check mAP with ultralytics/yolov3; in that way you can see how different the accuracy calculation algorithms are.


I used an RTX 2060, so I trained with CUDNN_HALF=1, but I also added mixed precision training to the pytorch version through nvidia apex.amp.

Also, if you retrain the model with CUDNN_HALF=0 and get better accuracy, I will try to find the bug, if there is one in Darknet.

I just don't use Loss-scale, because I don't apply FP16 to activations, but maybe it is required in some cases: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
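
For context, here is a minimal sketch of mixed-precision training with dynamic loss scaling via NVIDIA apex.amp, the approach drapado mentions (the model constructor, loss function and dataloader are placeholders, and the opt_level is an assumption):

import torch
from apex import amp

model = build_model().cuda()                          # placeholder model constructor
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# 'O1' casts eligible ops to FP16 and enables dynamic loss scaling
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

for imgs, targets in dataloader:                      # placeholder dataloader
    loss = compute_loss(model(imgs.cuda()), targets)  # placeholder loss fn
    optimizer.zero_grad()
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()                        # backward runs on the scaled loss
    optimizer.step()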

drapado commented 5 years ago

It seems that you can run the pytorch implementation with the darknet weights. I got these results by using the same weights, cfg file and validation set

drapado commented 5 years ago

Also, if you retrain the model with CUDNN_HALF=0 and get better accuracy, I will try to find the bug, if there is one in Darknet.

I already tried that some weeks ago and there was no difference in accuracy; it was even slightly better with CUDNN_HALF=1.

AlexeyAB commented 5 years ago

In addition to https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-482557005, there is also a better NMS algorithm in ultralytics/yolov3: https://github.com/ultralytics/yolov3/issues/72#issuecomment-482569996

glenn-jocher commented 5 years ago

@AlexeyAB @drapado yes, https://github.com/ultralytics/yolov3 accepts weights in either darknet or pytorch format in train.py, test.py and detect.py, and also computes mAP locally, which is validated against pycocotools to within about 1% (i.e. we get 0.611 mAP using our repo's mAP calculation vs 0.608 with pycocotools, with python3 test.py --weights weights/yolov3-spp.weights).

A couple points on training though:

AlexeyAB commented 5 years ago

@glenn-jocher Hi,

Can you provide a short manual on how to convert cfg/weights -> pt and convert back pt -> cfg/weights, to make it clear to most users? I want to link to it.

mAP in general is a terrible metric for real world usability, as it is optimized at extremely low confidence thresholds, creating a mess of FPs. F1 is more suitable I believe. See ultralytics/yolov3#188

Do you mean it is better to calculate many F1-scores, one for each threshold from 0.0 to 1.0 (for example) with step=0.01, and take the highest value? Since the optimal threshold for Yolo is ~0.25 while the optimal threshold for SSD/DSSD is ~0.5 - 0.8, we can't compare these two models at the same Confidence-threshold.
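
A sketch of that threshold sweep (all names illustrative: conf holds the confidence of each detection as a numpy array, is_tp is a boolean array marking detections matched to ground truth, n_gt is the total ground-truth count):

import numpy as np

def best_f1(conf, is_tp, n_gt):
    best, best_t = 0.0, 0.0
    for t in np.arange(0.0, 1.01, 0.01):          # thresholds 0.00..1.00, step 0.01
        keep = conf >= t
        tp = int(np.sum(is_tp & keep))
        fp = int(np.sum(~is_tp & keep))
        fn = n_gt - tp
        p = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        r = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        if f1 > best:
            best, best_t = f1, t                  # keep the highest F1 and its threshold
    return best, best_t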

glenn-jocher commented 5 years ago

@AlexeyAB yes, I will create a short conversion function. There is already a nice pathway to convert weights/cfg to .pt, but we don't have an easy way to convert back to .weights/cfg yet!

About the mAP, it seems I get the best mAP@0.5 on COCO by testing at extremely low confidence thresholds, about conf_thres=0.001. But if you actually look at those pictures, the result is terrible, there are about 10 FPs for every 1 TP (about 0.10 precision). So in the ultralytics/yolov3 repo we test at conf_thres=0.001 but we detect at conf_thres=0.5.
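
In practice that means something like (weights path illustrative):

python3 test.py   --weights weights/yolov3-tiny.weights --conf-thres 0.001  # mAP evaluation
python3 detect.py --weights weights/yolov3-tiny.weights --conf-thres 0.5    # usable detections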

This is an example using yolov3-tiny, from https://github.com/ultralytics/yolov3/issues/188. The top pictures, run at --conf-thres 0.001, produce much higher mAP using pycocotools. So I feel they've set a terrible metric, since now everyone is chasing mAP as some sort of end-all be-all metric for how well their object detector trained, when they are actually optimizing their system to produce junk like in the first examples:

[Image grid: ultralytics/yolov3 vs darknet yolov3-tiny.weights predictions on the person and zidane images, at --conf-thres 0.001 (top row) and --conf-thres 0.50 (bottom row)]
AlexeyAB commented 5 years ago

@glenn-jocher

About the mAP, it seems I get the best mAP@0.5 on COCO by testing at extremely low confidence thresholds, about conf_thres=0.001. But if you actually look at those pictures, the result is terrible, there are about 10 FPs for every 1 TP (about 0.10 precision). So in the ultralytics/yolov3 repo we test at conf_thres=0.001 but we detect at conf_thres=0.5.

This is an example using yolov3-tiny, from https://github.com/ultralytics/yolov3/issues/188. The top pictures, run at --conf-thres 0.001, produce much higher mAP using pycocotools. So I feel they've set a terrible metric, since now everyone is chasing mAP as some sort of end-all be-all metric for how well their object detector trained, when they are actually optimizing their system to produce junk like in the first examples:

mAP is calculated for all possible thresholds.

So when you set conf_thres=0.001, you just set the lowest threshold; mAP will then be calculated from threshold=0.001 to 1.0 with some step.

Why should we take detections with very low and very high thresholds into account?

So, to create a single rating of models, we should use mAP, which includes Precision and Recall for all possible thresholds.

That is why mAP is used in most detection ratings/competitions: Pascal VOC, MS COCO, ImageNet...


Actually, for MS COCO the mAP is calculated for 101 different thresholds; see the URLs at the bottom of the first message: https://github.com/AlexeyAB/darknet/issues/2746

We get 101 points on the Precision-Recall curve, for Recall = 0.0 - 1.0 with step 0.01. So for each of these points there will be a different threshold: https://github.com/AlexeyAB/darknet/blob/099b71d1de6b992ce8f9d7ff585c84efd0d4bf94/src/detector.c#L982-L1002
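
A rough numpy sketch of that 101-point sampling (illustrative, not the Darknet code; recall must be sorted ascending, with matching precision values):

import numpy as np

def ap_101(recall, precision):
    # precision envelope: max precision at any recall >= r
    mpre = np.maximum.accumulate(precision[::-1])[::-1]
    total = 0.0
    for r in np.linspace(0.0, 1.0, 101):          # Recall = 0.0..1.0, step 0.01
        i = np.searchsorted(recall, r, side='left')
        total += mpre[i] if i < len(mpre) else 0.0
    return total / 101.0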

glenn-jocher commented 5 years ago

@AlexeyAB thanks for the excellent summary of mAP and why it's important across different applications! Very educational for everyone. It is true that you can tune your P/R ratio to suit your needs as you move up the conf_thres, and yes, I see how the current mAP metric tests against all the various thresholds above the set value.

I've added a simple conversion function to export from pytorch to darknet format and vice versa. The process is very simple:

git clone https://github.com/ultralytics/yolov3 && cd yolov3

# darknet to pytorch
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'

# pytorch to darknet
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.pt')"
Success: converted 'weights/yolov3-spp.pt' to 'converted.weights'
AlexeyAB commented 5 years ago

@glenn-jocher Thank you! I will add URL to Readme.

glenn-jocher commented 5 years ago

@AlexeyAB Great! You could link to our iDetection iOS app also if you want, it runs YOLOv3-SPP 320 realtime (about 15-20 FPS) on devices with the newest Apple A12 processor (iPhone Xs, Xr, etc.)

It has a 5 star rating and over 700 downloads in the last two months. The screenshots below are from a previous release at 416 inference, which reduces the framerate to about 11 FPS. We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Older devices can run the app as well, but will suffer as the model year goes back. An iPhone 6s for example will run about 0.3 FPS. Apple has really been making leaps with their Neural Engine, which is at 5 TOPS now.

aditbhrgv commented 5 years ago

Hello @glenn-jocher & @AlexeyAB ,

I was trying to reproduce training & evaluation results on my custom dataset from the Darknet C implementation in the AlexeyAB repo. I get worse performance with the ultralytics/yolov3 implementation. Could you please let me know how to reproduce my Darknet C results in Pytorch?

Training dataset: ~7800 images
Test dataset: ~2560 images

Command which I ran to compute the metrics:

Darknet C implementation: ./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights

Converted the .weights file from Darknet to .pt: python3 -c "from models import *; convert('cfg/yolov3-tiny_3l.cfg', 'weights/yolov3-tiny_3l_20000.weights')" to get converted.pt, and then ran python test.py --cfg=cfg/yolov3-tiny_3l.cfg --data-cfg=cfg/obj.data --weights=converted.pt --img-size=608 --conf-thres=0.25 --batch-size=64

ultralytics/yolov3 Pytorch trained model implementation: python test.py --cfg=cfg/yolov3-tiny_3l.cfg --data-cfg=cfg/obj.data --weights=weights/best.pt --img-size=608 --conf-thres=0.25 --batch-size=64

Metric (@0.25 conf-thres)   Darknet-trained model   Converted (.weights -> .pt)   ultralytics/yolov3-trained model
Precision                   0.78                    59.5%                         0.45
Recall                      0.72                    57.7%                         0.643
F1 score                    0.75                    58.6%                         0.53
mAP@0.5                     0.7435                  56.3%                         0.553

Thanks

aditbhrgv commented 5 years ago

Also, when I tried to convert the Pytorch model to darknet .weights format, I get no detections in Darknet.

python3 -c "from models import *; convert('cfg/yolov3-tiny_3l.cfg', 'weights/best.pt')"

calculation mAP (mean average precision)...
2560
detections_count = 0, unique_truth_count = 5009
class_id = 0, name = tl_pair, ap = 0.00% (TP = 0, FP = 0)
class_id = 1, name = hl_pair, ap = 0.00% (TP = 0, FP = 0)

for thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan
for thresh = 0.25, TP = 0, FP = 0, FN = 5009, average IoU = 0.00 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.000000, or 0.00 %
Total Detection Time: 84.000000 Seconds

Set -points flag:
-points 101 for MS COCO
-points 11 for PascalVOC 2007 (uncomment difficult in voc.data)
-points 0 (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
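
Following that hint, the map call with an explicit -points flag would look something like this (paths taken from the commands above, AUC scoring for a custom dataset):

./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights -points 0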

glenn-jocher commented 5 years ago

@aditbhrgv generally testing for mAP computation (to get the results shown in https://github.com/ultralytics/yolov3#map) should be done at extremely low conf_thres, such as the default value in test.py of 0.001.

We've not actually tried using converted models, so this is an interesting finding. What happens if you convert the official yolov3.pt model from https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI to .weights format and test that?

aditbhrgv commented 5 years ago

@aditbhrgv generally testing for mAP computation (to get the results shown in https://github.com/ultralytics/yolov3#map) should be done at extremely low conf_thres, such as the default value in test.py of 0.001.

Actually, I am not interested in mAP; I just care about comparable P, R and F1 scores in both implementations for a particular threshold (0.25 in the example above). I wonder what implementation differences could lead to the above results on my custom dataset.

aditbhrgv commented 5 years ago

We've not actually tried using converted models, so this is an interesting finding. What happens if you convert the official yolov3.pt model from https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI to .weights format and test that?

Hello @AlexeyAB, I tried this with the official yolov3.pt on a single image (dog.jpg) and still can't get any detections. Command:

/build/darknet detector test cfg/coco.data cfg/yolov3.cfg /home/Darknet2Pytorch/yolov3/converted.weights data/dog.jpg

There's definitely some problem in converting from Pytorch to Darknet weights.

Thanks

glenn-jocher commented 5 years ago

@aditbhrgv ah buddy I think you are confusing the extensions and repositories a bit:

With the ultralytics/yolov3 repo, here are the commands to detect the default images (using rectangular inference at 416 pixels :) with 1) the original darknet yolov3-spp.weights, 2) darknet converted to pytorch converted.pt weights, and lastly 3) pytorch converted back to darknet as converted.weights. This round trip should fully verify the conversion functionality, I believe:

# 1) original darknet weights ------------------------------------------------------------------
python3 detect.py --weights weights/yolov3-spp.weights  # original darknet weights
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='weights/yolov3-spp.weights')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.755s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.607s)

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'

python3 detect.py --weights converted.pt  # converted to pytorch
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='converted.pt')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.749s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.588s)

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'converted.pt')"
Success: converted 'converted.pt' to 'converted.weights'

python3 detect.py --weights converted.weights  # converted back to darknet
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='converted.weights')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.749s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.594s)
[Images: detection results on bus.jpg and zidane.jpg]
aditbhrgv commented 5 years ago

@glenn-jocher Thanks for the clarification. I was thinking I could use the converted.weights from Pytorch in the Darknet C implementation.

Just a last quick question, how can I reproduce the results on my custom dataset as Darknet C Implementation in Pytorch implementation ? (see table here https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-487479712) .

I didn't use multi-scale training, neither in Darknet C nor in Pytorch implementation.

Metric (@0.25 conf-thres)   Darknet-trained model   Converted (.weights -> .pt)   ultralytics/yolov3-trained model
Precision                   0.78                    59.5%                         0.45
Recall                      0.72                    57.7%                         0.643
F1 score                    0.75                    58.6%                         0.53
mAP@0.5                     0.7435                  56.3%                         0.553

AlexeyAB commented 5 years ago

@aditbhrgv Hi,

Hello @glenn-jocher & @AlexeyAB ,

I was trying to reproduce training & evaluation results on my custom dataset from the Darknet C implementation in the AlexeyAB repo. I get worse performance with the ultralytics/yolov3 implementation. Could you please let me know how to reproduce my Darknet C results in Pytorch?

Training dataset: ~7800 images
Test dataset: ~2560 images

Command which I ran to compute the metrics:

Darknet C implementation: ./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights

Can you attach the yolov3-tiny_3l.cfg file? (Rename it to a txt-file and attach.)


Try to test the official yolov3.pt on https://github.com/pjreddie/darknet instead of https://github.com/AlexeyAB/darknet. Does it work?

Hello @AlexeyAB, I tried this with the official yolov3.pt on a single image (dog.jpg) and still can't get any detections. Command:

/build/darknet detector test cfg/coco.data cfg/yolov3.cfg /home/Darknet2Pytorch/yolov3/converted.weights data/dog.jpg

There's definitely some problem in converting from Pytorch to Darknet weights.

Thanks

AlexeyAB commented 5 years ago

@AlexeyAB Great! You could link to our iDetection iOS app also if you want, it runs YOLOv3-SPP 320 realtime (about 15-20 FPS) on devices with the newest Apple A12 processor (iPhone Xs, Xr, etc.)

It has a 5 star rating and over 700 downloads in the last two months. The screenshots below are from a previous release at 416 inference, which reduces the framerate to about 11 FPS. We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Older devices can run the app as well, but will suffer as the model year goes back. An iPhone 6s for example will run about 0.3 FPS. Apple has really been making leaps with their Neural Engine, which is at 5 TOPS now.

@glenn-jocher Hi,

That's great! I will add URL.

We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Do you mean that you currently use a square network size (320x320) and use letter_box resizing with padding? https://github.com/AlexeyAB/darknet/issues/232#issuecomment-336955485

And that you will add the ability to use a rectangular 16:9 network size (320x192 or 576x320) with a simple resize, without padding?


Did you try to implement XNOR-net on ARM/ Apple A12 processor? https://github.com/AlexeyAB/darknet/issues/2365#issuecomment-462923756

glenn-jocher commented 5 years ago

@AlexeyAB we just got it done today!!! See https://github.com/ultralytics/yolov3/issues/232#issuecomment-487692744.

To answer your question: yes, previously our app was running at 416x416, letterboxing vertical 4k iPhone Xs video (the 4k video was resized to 234x416 and then padded/letterboxed to 416x416). This ran at about 11 FPS. We reduced this to 320x320 to improve performance, and this ran at about 18 FPS. This is the current v4 app available for download today on the app store.

After our rectangular inference builds the app can now run YOLOv3-SPP at 30FPS 192x320, or 20FPS 256x416. We still letterbox/pad the short dimension to the nearest 32 multiple though. So for example the 4k video is resized to 234x416 (width x height), and then padded with 11 pixels on the left + 11 on the right to round out a multiple of 32: 256x416.
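
A quick sketch of that shape computation (function name illustrative):

def rect_infer_shape(w, h, max_dim=416, stride=32):
    # scale the long side to max_dim, then pad the short side
    # up to the nearest multiple of stride
    scale = max_dim / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w = (stride - new_w % stride) % stride
    pad_h = (stride - new_h % stride) % stride
    return new_w + pad_w, new_h + pad_h

print(rect_infer_shape(2160, 3840))  # vertical 4k frame -> (256, 416)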

I don't know what XNOR-net is though. Here is an actual screenshot from today in Madrid, with a 1.15X zoom factor also (we enabled pinch-to-zoom functionality as well!! :)

aditbhrgv commented 5 years ago

Can you attach yolov3-tiny_3l.cfg file? (rename it to cfg-file and attach).

Hello @AlexeyAB Please find the attached cfg file. yolov3-tiny_3l.cfg.txt

gwestner94 commented 5 years ago

Hi @glenn-jocher, I am having the same issue as @aditbhrgv when making the round trip from: alexey/darknet -> pytorch -> alexey/darknet using the supplied pytorch yolov3 model as well as custom trained pytorch yolov3 models.

After the conversion nothing is detected.

I figured out that after changing the header information in the weights file with a tool like vbindiff (https://linux.die.net/man/1/vbindiff), a default tiny-yolov3 (https://pjreddie.com/media/files/yolov3-tiny.weights) reproduces the correct output on dog.jpg in alexey/darknet.

But when converting a custom model I suffer a big accuracy loss (scores drop by almost 0.5).

The weird thing is that when I use my custom model or the default tiny yolov3 after conversion on the pjreddie version of darknet, the network produces the right output after the vbindiff change.

@AlexeyAB is there a difference, that you are aware of, between the pjreddie repository and yours that could cause such a mismatch?

@glenn-jocher Is there some reason why you don't preserve the header information after conversion? Is your conversion tested on the AlexeyAB/darknet version?

Thank you very much, this would clear up a lot for me

glenn-jocher commented 5 years ago

@gwestner94 we can test out the conversion mAPs. The commands (and saved outputs) are here. All 3 results are identical, performing the mAP calculation using ultralytics/yolov3. The original yolov3-spp.weights was downloaded from https://pjreddie.com/media/files/yolov3-spp.weights.

This mAP round-trip should be reproducible in our Google Colab Notebook.

If the headers are different, perhaps the header may play a role when using this repo. Feel free to submit a PR for header inclusion over at ultralytics/yolov3 if you'd like.

git clone https://github.com/ultralytics/yolov3
cd yolov3

# 1) original darknet weights ------------------------------------------------------------------
python3 test.py --weights weights/yolov3-spp.weights --save-json
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
# Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'
python3 test.py --weights converted.pt --save-json 
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'converted.pt')"
# Success: converted 'converted.pt' to 'converted.weights'
python3 test.py --weights converted.weights --save-json 
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

EDIT 1: @gwestner94 after re-reading your question a second test would be to perform the same round trip using an AlexeyAB/darknet trained network. I just so happen to have one of these. I can try the round trip again with it later.

EDIT 2: Round trip successfully performed on custom dataset trained on AlexeyAB/darknet.

# 1) original darknet weights ------------------------------------------------------------------
python3 test.py --weights ../darknet/backup/yolov3-spp-sm2-1cls_5000.weights --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.72s/it]
#                 all        25       479     0.486     0.971     0.868     0.648

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp-sm2-1cls.cfg', '../darknet/backup/yolov3-spp-sm2-1cls_5000.weights')"
# Success: converted '../darknet/backup/yolov3-spp-sm2-1cls_5000.weights' to 'converted.pt'
python3 test.py --weights converted.pt --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.72s/it]
#                 all        25       479     0.486     0.971     0.868     0.648

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp-sm2-1cls.cfg', 'converted.pt')"
# Success: converted 'converted.pt' to 'converted.weights'
python3 test.py --weights converted.weights --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.65s/it]
#                 all        25       479     0.486     0.971     0.868     0.648
gwestner94 commented 5 years ago

Thank you for your feedback! I will look into the header specifics and give you an update on your repository when I find a solution. It looks like the problem can be solved with correct header information.

Sudhakar17 commented 5 years ago

I tested the yolo-v3 model using COCO-Val data in both darknet and pytorch(ultralytics).

In Pytorch framework:

1. Yolo-v3.weights (original from darknet) -- 54.2% mAP
2. Yolo-v3_converted.pt (converted using ultralytics code) -- 54.2%
3. Yolo-v3_converted.weights (converted back to original weights) -- 54.2%

In Darknet framework:

1. Yolo-v3.weights (original) -- 54.37%
2. Yolo_v3_converted.weights (darknet --> pytorch --> darknet weights) -- 0%

The save_weights method from the ultralytics code:

def save_weights(self, path='model.weights', cutoff=-1):
    # Converts a PyTorch model to Darknet format (.pt to .weights)
    # Note: does not work if model.fuse() is applied
    with open(path, 'wb') as f:
        self.header_info[3] = self.seen  # number of images seen during training
        self.header_info.tofile(f)

        # Iterate through layers
        for i, (module_def, module) in enumerate(zip(self.module_defs[:cutoff], self.module_list[:cutoff])):
            if module_def['type'] == 'convolutional':
                conv_layer = module[0]
                # If batch norm, write bn parameters first
                if module_def['batch_normalize']:
                    bn_layer = module[1]
                    bn_layer.bias.data.cpu().numpy().tofile(f)
                    bn_layer.weight.data.cpu().numpy().tofile(f)
                    bn_layer.running_mean.data.cpu().numpy().tofile(f)
                    bn_layer.running_var.data.cpu().numpy().tofile(f)
                # Otherwise write the conv bias
                else:
                    conv_layer.bias.data.cpu().numpy().tofile(f)
                # Write the conv weights
                conv_layer.weight.data.cpu().numpy().tofile(f)

It writes the header info into the weights file. Can you please tell us what went wrong in the .weights file conversion? @glenn-jocher

What do you mean by "correct header information"? @gwestner94

glenn-jocher commented 5 years ago

@Sudhakar17 I don't believe anything went wrong with the weights conversion, as you can see from your own pytorch framework experiment. I myself don't have knowledge of how the headers are used in this AlexeyAB/darknet repository, that would be a question for @AlexeyAB. We do not use them at all in https://github.com/ultralytics/yolov3.

AlexeyAB commented 5 years ago

@Sudhakar17 Hi,

Can you share (e.g. via Google Drive) 4 files? I will check the difference:

  1. yolo-v3.weights (original from darknet)
  2. yolo-v3_converted.pt (converted using ultralytics code)
  3. yolo-v3_converted.weights
  4. yolov3.cfg (to be sure that we use exactly the same model)
glenn-jocher commented 5 years ago

@AlexeyAB thanks! We presently create a header of five int32 values, write the number of images seen at index 3, and leave everything else as zeros. Are there any other variables that should be written to this header when saving to a *.weights file? Is each value 32 bits?

# Needed to write header when saving *.weights
self.header_info = np.zeros(5, dtype=np.int32)  # First five are header values
self.header_info[3] = seen  # number of images seen during training
AlexeyAB commented 5 years ago

@glenn-jocher


You should use the values major=0, minor=2, revision=5. The old version 0.1.0 used uint32_t for seen instead of uint64_t, so its header was 4 bytes shorter.
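
A minimal numpy sketch of the header layout that implies (assuming little-endian byte order, as on x86):

import numpy as np

def write_darknet_header(f, seen=0):
    # major=0, minor=2, revision=5: versions >= 0.2 store 'seen' as a
    # 64-bit integer, so the header is 3 x int32 + 1 x uint64 = 20 bytes
    np.array([0, 2, 5], dtype=np.int32).tofile(f)
    np.array([seen], dtype=np.uint64).tofile(f)   # images seen during training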

Sudhakar17 commented 5 years ago

@AlexeyAB I am traveling at the moment. I used the original yolo-v3 model and the same cfg file. The mAP values will be different since I didn't update my local darknet repository; I will update my repository and rerun the converted model later. Is this header info used anywhere for calculating mAP? @glenn-jocher

Sudhakar17 commented 5 years ago

I updated the darknet repository and ran yolo-v3_converted.weights. It's not working. Any new updates? @AlexeyAB @glenn-jocher

AlexeyAB commented 5 years ago

@glenn-jocher Hi, did you fix the header (version) in your conversion script? https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-496675346

glenn-jocher commented 5 years ago

@AlexeyAB @Sudhakar17 I just fixed this now in https://github.com/ultralytics/yolov3/commit/d7a28bd9f74d922216e06de3dde5f981b3002bd4

@Sudhakar17 you should now be able to run pytorch exported models in darknet.