AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Worse performance in this repo than in a yolov3 pytorch implementation #2914

Open drapado opened 5 years ago

drapado commented 5 years ago

I've tested this yolov3 implementation in pytorch: https://github.com/ultralytics/yolov3. I used the same dataset and the same yolov3-spp.cfg file (the same everything) for the tests. I achieved these results (consistent over several attempts):

AlexeyAB commented 5 years ago

Did you train 2 different models, one by using https://github.com/AlexeyAB/darknet and another by using https://github.com/ultralytics/yolov3 ?

How many Training and Validation images did you use?

What GPU did you use for training using Darknet?

Did you set CUDNN_HALF=1?
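
For reference, CUDNN_HALF is a compile-time flag at the top of the AlexeyAB/darknet Makefile; a sketch of a typical build config for a Tensor-Core GPU (Volta, RTX 20xx) would be:

GPU=1
CUDNN=1
CUDNN_HALF=1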

As claimed, +0.1% - 0.3% higher mAP accuracy can be achieved by using ultralytics/yolov3 rather than Darknet: https://github.com/ultralytics/yolov3#map


There are several assumptions here:

drapado commented 5 years ago

Yes, I trained two different models starting from darknet53.conv.74 and yolov3-spp.cfg adapted for 4 classes. I kept the rest of the hyperparameters equal in both frameworks. I have 4000 images in the train set and 1000 in the valid set.

I used an RTX 2060, so I trained with CUDNN_HALF=1, but I also added mixed precision training to the pytorch version through nvidia apex.amp.

I believe it's related to the presence of very small objects (smaller than 10x10 at img size=256). These objects belong to class 1, the class whose mAP improves the most in this pytorch implementation and also the class with the most examples.

AlexeyAB commented 5 years ago

You can try to convert your Darknet weights-file to a PyTorch .pt weights-file and check mAP with ultralytics/yolov3; in that way you can see how different the accuracy calculation algorithms are.


I used an RTX 2060, so I trained with CUDNN_HALF=1, but I also added mixed precision training to the pytorch version through nvidia apex.amp.

Also, if you retrain the model with CUDNN_HALF=0 and get better accuracy, I will try to find the bug, if there is one in Darknet.

I just don't use Loss-scale, because I don't apply FP16 to activations, but maybe it is required in some cases: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
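
For context, here is a minimal sketch of mixed-precision training with dynamic loss scaling via NVIDIA apex.amp, the approach drapado mentions (the model constructor, loss function and dataloader are placeholders, and the opt_level is an assumption):

import torch
from apex import amp

model = build_model().cuda()                          # placeholder model constructor
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# 'O1' casts eligible ops to FP16 and enables dynamic loss scaling
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

for imgs, targets in dataloader:                      # placeholder dataloader
    loss = compute_loss(model(imgs.cuda()), targets)  # placeholder loss fn
    optimizer.zero_grad()
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()                        # backward runs on the scaled loss
    optimizer.step()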

drapado commented 5 years ago

It seems that you can run the pytorch implementation with the darknet weights. I got these results by using the same weights, cfg file and validation set

drapado commented 5 years ago

Also, if you retrain the model with CUDNN_HALF=0 and get better accuracy, I will try to find the bug, if there is one in Darknet.

I already tried that some weeks ago and there was no difference in accuracy; it was even slightly better with CUDNN_HALF=1.

AlexeyAB commented 5 years ago

In addition to https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-482557005, there is also a better NMS algorithm in ultralytics/yolov3: https://github.com/ultralytics/yolov3/issues/72#issuecomment-482569996

glenn-jocher commented 5 years ago

@AlexeyAB @drapado yes, https://github.com/ultralytics/yolov3 accepts weights in either darknet or pytorch format in train.py, test.py and detect.py, and also computes mAP locally, which is validated against pycocotools to within about 1% (i.e. we get 0.611 mAP using our repo's mAP calculation vs 0.608 with pycocotools, with python3 test.py --weights weights/yolov3-spp.weights).

A couple points on training though:

AlexeyAB commented 5 years ago

@glenn-jocher Hi,

Can you provide a short manual on how to convert cfg/weights -> pt and convert back pt -> cfg/weights, to make it clear to most users? I want to link to it.

mAP in general is a terrible metric for real world usability, as it is optimized at extremely low confidence thresholds, creating a mess of FPs. F1 is more suitable I believe. See ultralytics/yolov3#188

Do you mean it is better to calculate many F1-scores, one for each threshold from 0.0 to 1.0 (for example) with step=0.01, and take the highest value? Since the optimal threshold for Yolo is ~0.25 while the optimal threshold for SSD/DSSD is ~0.5 - 0.8, we can't compare these two models at the same Confidence-threshold.
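
A sketch of that threshold sweep (all names illustrative: conf holds the confidence of each detection as a numpy array, is_tp is a boolean array marking detections matched to ground truth, n_gt is the total ground-truth count):

import numpy as np

def best_f1(conf, is_tp, n_gt):
    best, best_t = 0.0, 0.0
    for t in np.arange(0.0, 1.01, 0.01):          # thresholds 0.00..1.00, step 0.01
        keep = conf >= t
        tp = int(np.sum(is_tp & keep))
        fp = int(np.sum(~is_tp & keep))
        fn = n_gt - tp
        p = tp / (tp + fp) if (tp + fp) > 0 else 0.0
        r = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        if f1 > best:
            best, best_t = f1, t                  # keep the highest F1 and its threshold
    return best, best_t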

glenn-jocher commented 5 years ago

@AlexeyAB yes, I will create a short conversion function. There is already a nice pathway to convert weights/cfg to .pt, but we don't have an easy way to convert back to .weights/cfg yet!

About the mAP, it seems I get the best mAP@0.5 on COCO by testing at extremely low confidence thresholds, about conf_thres=0.001. But if you actually look at those pictures, the result is terrible, there are about 10 FPs for every 1 TP (about 0.10 precision). So in the ultralytics/yolov3 repo we test at conf_thres=0.001 but we detect at conf_thres=0.5.
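
In practice that means something like (weights path illustrative):

python3 test.py   --weights weights/yolov3-tiny.weights --conf-thres 0.001  # mAP evaluation
python3 detect.py --weights weights/yolov3-tiny.weights --conf-thres 0.5    # usable detections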

This is an example using yolov3-tiny, from https://github.com/ultralytics/yolov3/issues/188. The top pictures, run at --conf-thres 0.001, produce much higher mAP using pycocotools. So I feel they've set a terrible metric, since now everyone is chasing mAP as some sort of end-all be-all metric for how well their object detector trained, when they are actually optimizing their system to produce junk like in the first examples:

[Image grid: ultralytics/yolov3 vs darknet yolov3-tiny.weights predictions on the person and zidane images, at --conf-thres 0.001 (top row) and --conf-thres 0.50 (bottom row)]
AlexeyAB commented 5 years ago

@glenn-jocher

About the mAP, it seems I get the best mAP@0.5 on COCO by testing at extremely low confidence thresholds, about conf_thres=0.001. But if you actually look at those pictures, the result is terrible, there are about 10 FPs for every 1 TP (about 0.10 precision). So in the ultralytics/yolov3 repo we test at conf_thres=0.001 but we detect at conf_thres=0.5.

This is an example using yolov3-tiny, from https://github.com/ultralytics/yolov3/issues/188. The top pictures, run at --conf-thres 0.001, produce much higher mAP using pycocotools. So I feel they've set a terrible metric, since now everyone is chasing mAP as some sort of end-all be-all metric for how well their object detector trained, when they are actually optimizing their system to produce junk like in the first examples:

mAP is calculated for all possible thresholds.

So when you set conf_thres=0.001, you just set the lowest threshold; mAP will then be calculated from threshold=0.001 to 1.0 with some step.

Why should we take detections with very low and very high thresholds into account?

So, to create a single rating of models, we should use mAP, which includes Precision and Recall for all possible thresholds.

That is why mAP is used in most detection ratings/competitions: Pascal VOC, MS COCO, ImageNet...


Actually, for MS COCO the mAP is calculated for 101 different thresholds; see the URLs at the bottom of the first message: https://github.com/AlexeyAB/darknet/issues/2746

We get 101 points on the Precision-Recall curve, for Recall = 0.0 - 1.0 with step 0.01. So for each of these points there will be a different threshold: https://github.com/AlexeyAB/darknet/blob/099b71d1de6b992ce8f9d7ff585c84efd0d4bf94/src/detector.c#L982-L1002
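
A rough numpy sketch of that 101-point sampling (illustrative, not the Darknet code; recall must be sorted ascending, with matching precision values):

import numpy as np

def ap_101(recall, precision):
    # precision envelope: max precision at any recall >= r
    mpre = np.maximum.accumulate(precision[::-1])[::-1]
    total = 0.0
    for r in np.linspace(0.0, 1.0, 101):          # Recall = 0.0..1.0, step 0.01
        i = np.searchsorted(recall, r, side='left')
        total += mpre[i] if i < len(mpre) else 0.0
    return total / 101.0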

glenn-jocher commented 5 years ago

@AlexeyAB thanks for the excellent summary of mAP and why it's important across different applications! Very educational for everyone. It is true that you can tune your P/R ratio to suit your needs as you move up the conf_thres, and yes, I see how the current mAP metric tests against all the various thresholds above the set value.

I've added a simple conversion function to export from pytorch to darknet format and vice versa. The process is very simple:

git clone https://github.com/ultralytics/yolov3 && cd yolov3

# darknet to pytorch
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'

# pytorch to darknet
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.pt')"
Success: converted 'weights/yolov3-spp.pt' to 'converted.weights'
AlexeyAB commented 5 years ago

@glenn-jocher Thank you! I will add URL to Readme.

glenn-jocher commented 5 years ago

@AlexeyAB Great! You could link to our iDetection iOS app also if you want, it runs YOLOv3-SPP 320 realtime (about 15-20 FPS) on devices with the newest Apple A12 processor (iPhone Xs, Xr, etc.)

It has a 5 star rating and over 700 downloads in the last two months. The screenshots below are from a previous release at 416 inference, which reduces the framerate to about 11 FPS. We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Older devices can run the app as well, but will suffer as the model year goes back. An iPhone 6s for example will run about 0.3 FPS. Apple has really been making leaps with their Neural Engine, which is at 5 TOPS now.

aditbhrgv commented 5 years ago

Hello @glenn-jocher & @AlexeyAB ,

I was trying to reproduce training & evaluation results on my custom dataset from the Darknet C implementation in the AlexeyAB repo. I get worse performance with the ultralytics/yolov3 implementation. Could you please let me know how to reproduce my Darknet C results in Pytorch?

Training dataset: ~7800 images
Test dataset: ~2560 images

Command which I ran to compute the metrics:

Darknet C implementation: ./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights

Converted the .weights file from Darknet to .pt: python3 -c "from models import *; convert('cfg/yolov3-tiny_3l.cfg', 'weights/yolov3-tiny_3l_20000.weights')" to get converted.pt, and then ran python test.py --cfg=cfg/yolov3-tiny_3l.cfg --data-cfg=cfg/obj.data --weights=converted.pt --img-size=608 --conf-thres=0.25 --batch-size=64

ultralytics/yolov3 Pytorch trained model implementation: python test.py --cfg=cfg/yolov3-tiny_3l.cfg --data-cfg=cfg/obj.data --weights=weights/best.pt --img-size=608 --conf-thres=0.25 --batch-size=64

Metric (@0.25 conf-thres)   Darknet-trained model   Converted (.weights -> .pt)   ultralytics/yolov3-trained model
Precision                   0.78                    59.5%                         0.45
Recall                      0.72                    57.7%                         0.643
F1 score                    0.75                    58.6%                         0.53
mAP@0.5                     0.7435                  56.3%                         0.553

Thanks

aditbhrgv commented 5 years ago

Also, when I tried to convert the Pytorch model to darknet .weights format, I get no detections in Darknet.

python3 -c "from models import *; convert('cfg/yolov3-tiny_3l.cfg', 'weights/best.pt')"

calculation mAP (mean average precision)...
2560
detections_count = 0, unique_truth_count = 5009
class_id = 0, name = tl_pair, ap = 0.00% (TP = 0, FP = 0)
class_id = 1, name = hl_pair, ap = 0.00% (TP = 0, FP = 0)

for thresh = 0.25, precision = -nan, recall = 0.00, F1-score = -nan
for thresh = 0.25, TP = 0, FP = 0, FN = 5009, average IoU = 0.00 %

IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.000000, or 0.00 %
Total Detection Time: 84.000000 Seconds

Set -points flag:
-points 101 for MS COCO
-points 11 for PascalVOC 2007 (uncomment difficult in voc.data)
-points 0 (AUC) for ImageNet, PascalVOC 2010-2012, your custom dataset
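
Following that hint, the map call with an explicit -points flag would look something like this (paths taken from the commands above, AUC scoring for a custom dataset):

./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights -points 0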

glenn-jocher commented 5 years ago

@aditbhrgv generally testing for mAP computation (to get the results shown in https://github.com/ultralytics/yolov3#map) should be done at extremely low conf_thres, such as the default value in test.py of 0.001.

We've not actually tried using converted models, so this is an interesting finding. What happens if you convert the official yolov3.pt model from https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI to .weights format and test that?

aditbhrgv commented 5 years ago

@aditbhrgv generally testing for mAP computation (to get the results shown in https://github.com/ultralytics/yolov3#map) should be done at extremely low conf_thres, such as the default value in test.py of 0.001.

Actually, I am not interested in mAP; I just care about comparable P, R and F1 scores in both implementations for a particular threshold (0.25 in the example above). I wonder what implementation differences could lead to the above results on my custom dataset.

aditbhrgv commented 5 years ago

We've not actually tried using converted models, so this is an interesting finding. What happens if you convert the official yolov3.pt model from https://drive.google.com/drive/folders/1uxgUBemJVw9wZsdpboYbzUN4bcRhsuAI to .weights format and test that?

Hello @AlexeyAB, I tried this with the official yolov3.pt on a single image (dog.jpg) and still can't get any detections. Command:

/build/darknet detector test cfg/coco.data cfg/yolov3.cfg /home/Darknet2Pytorch/yolov3/converted.weights data/dog.jpg

There's definitely some problem in converting from Pytorch to Darknet weights.

Thanks

glenn-jocher commented 5 years ago

@aditbhrgv ah buddy I think you are confusing the extensions and repositories a bit:

With the ultralytics/yolov3 repo, here are the commands to detect the default images (using rectangular inference at 416 pixels :) with 1) the original darknet yolov3-spp.weights, 2) darknet converted to pytorch converted.pt weights, and lastly 3) pytorch converted back to darknet as converted.weights. This round trip should fully verify the conversion functionality, I believe:

# 1) original darknet weights ------------------------------------------------------------------
python3 detect.py --weights weights/yolov3-spp.weights  # original darknet weights
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='weights/yolov3-spp.weights')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.755s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.607s)

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'

python3 detect.py --weights converted.pt  # converted to pytorch
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='converted.pt')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.749s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.588s)

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'converted.pt')"
Success: converted 'converted.pt' to 'converted.weights'

python3 detect.py --weights converted.weights  # converted back to darknet
Namespace(cfg='cfg/yolov3-spp.cfg', conf_thres=0.5, data_cfg='data/coco.data', images='data/samples', img_size=416, nms_thres=0.5, weights='converted.weights')
Using CPU
image 1/2 data/samples/bus.jpg: 416x320 1 handbags, 3 persons, 1 buss, Done. (0.749s)
image 2/2 data/samples/zidane.jpg: 256x416 1 ties, 2 persons, Done. (0.594s)
[Images: detection results on bus.jpg and zidane.jpg]
aditbhrgv commented 5 years ago

@glenn-jocher Thanks for the clarification. I was thinking I could use the converted.weights from Pytorch in the Darknet C implementation.

Just a last quick question, how can I reproduce the results on my custom dataset as Darknet C Implementation in Pytorch implementation ? (see table here https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-487479712) .

I didn't use multi-scale training, neither in Darknet C nor in Pytorch implementation.

Metric (@0.25 conf-thres)   Darknet-trained model   Converted (.weights -> .pt)   ultralytics/yolov3-trained model
Precision                   0.78                    59.5%                         0.45
Recall                      0.72                    57.7%                         0.643
F1 score                    0.75                    58.6%                         0.53
mAP@0.5                     0.7435                  56.3%                         0.553

AlexeyAB commented 5 years ago

@aditbhrgv Hi,

Hello @glenn-jocher & @AlexeyAB ,

I was trying to reproduce training & evaluation results on my custom dataset from the Darknet C implementation in the AlexeyAB repo. I get worse performance with the ultralytics/yolov3 implementation. Could you please let me know how to reproduce my Darknet C results in Pytorch?

Training dataset: ~7800 images
Test dataset: ~2560 images

Command which I ran to compute the metrics:

Darknet C implementation: ./build/darknet detector map cfg/hld.data cfg/yolov3-tiny_3l.cfg weights/yolov3-tiny_3l_20000.weights

Can you attach the yolov3-tiny_3l.cfg file? (Rename it to a txt-file and attach.)


Try to test the official yolov3.pt on https://github.com/pjreddie/darknet instead of https://github.com/AlexeyAB/darknet. Does it work?

Hello @AlexeyAB, I tried this with the official yolov3.pt on a single image (dog.jpg) and still can't get any detections. Command:

/build/darknet detector test cfg/coco.data cfg/yolov3.cfg /home/Darknet2Pytorch/yolov3/converted.weights data/dog.jpg

There's definitely some problem in converting from Pytorch to Darknet weights.

Thanks

AlexeyAB commented 5 years ago

@AlexeyAB Great! You could link to our iDetection iOS app also if you want, it runs YOLOv3-SPP 320 realtime (about 15-20 FPS) on devices with the newest Apple A12 processor (iPhone Xs, Xr, etc.)

It has a 5 star rating and over 700 downloads in the last two months. The screenshots below are from a previous release at 416 inference, which reduces the framerate to about 11 FPS. We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Older devices can run the app as well, but will suffer as the model year goes back. An iPhone 6s for example will run about 0.3 FPS. Apple has really been making leaps with their Neural Engine, which is at 5 TOPS now.

@glenn-jocher Hi,

That's great! I will add URL.

We are working on introducing rectangular inference as well, which could theoretically boost the FPS by 40% on HD (16:9) aspect ratios vs square inference, adding pinch to zoom functionality like the native camera app, and a few other updates.

Do you mean that you currently use a square network size (320x320) and use letter_box resizing with padding? https://github.com/AlexeyAB/darknet/issues/232#issuecomment-336955485

And that you will add the ability to use a rectangular 16:9 network size (320x192 or 576x320) with a simple resize, without padding?


Did you try to implement XNOR-net on ARM/ Apple A12 processor? https://github.com/AlexeyAB/darknet/issues/2365#issuecomment-462923756

glenn-jocher commented 5 years ago

@AlexeyAB we just got it done today!!! See https://github.com/ultralytics/yolov3/issues/232#issuecomment-487692744.

To answer your question: yes, previously our app was running at 416x416, letterboxing vertical 4k iPhone Xs video (the 4k video was resized to 234x416 and then padded/letterboxed to 416x416). This ran at about 11 FPS. We reduced this to 320x320 to improve performance, and this ran at about 18 FPS. This is the current v4 app available for download today on the app store.

After our rectangular inference builds the app can now run YOLOv3-SPP at 30FPS 192x320, or 20FPS 256x416. We still letterbox/pad the short dimension to the nearest 32 multiple though. So for example the 4k video is resized to 234x416 (width x height), and then padded with 11 pixels on the left + 11 on the right to round out a multiple of 32: 256x416.
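
A quick sketch of that shape computation (function name illustrative):

def rect_infer_shape(w, h, max_dim=416, stride=32):
    # scale the long side to max_dim, then pad the short side
    # up to the nearest multiple of stride
    scale = max_dim / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w = (stride - new_w % stride) % stride
    pad_h = (stride - new_h % stride) % stride
    return new_w + pad_w, new_h + pad_h

print(rect_infer_shape(2160, 3840))  # vertical 4k frame -> (256, 416)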

I don't know what XNOR-net is though. Here is an actual screenshot from today in Madrid, with a 1.15X zoom factor also (we enabled pinch-to-zoom functionality as well!! :)

aditbhrgv commented 5 years ago

Can you attach yolov3-tiny_3l.cfg file? (rename it to cfg-file and attach).

Hello @AlexeyAB Please find the attached cfg file. yolov3-tiny_3l.cfg.txt

gwestner94 commented 5 years ago

Hi @glenn-jocher, I am having the same issue as @aditbhrgv when making the round trip from: alexey/darknet -> pytorch -> alexey/darknet using the supplied pytorch yolov3 model as well as custom trained pytorch yolov3 models.

After the conversion nothing is detected.

I figured out that after changing the header information in the weights file with a tool like vbindiff (https://linux.die.net/man/1/vbindiff), a default tiny-yolov3 (https://pjreddie.com/media/files/yolov3-tiny.weights) reproduces the correct output on dog.jpg in alexey/darknet.

But when converting a custom model I suffer a big accuracy loss (scores drop by almost 0.5).

The weird thing is that when I use my custom model or the default tiny yolov3 after conversion on the pjreddie version of darknet, the network produces the right output after the vbindiff change.

@AlexeyAB is there a difference, that you are aware of, between the pjreddie repository and yours that could cause such a mismatch?

@glenn-jocher Is there some reason why you don't preserve the header information after conversion? Is your conversion tested on the AlexeyAB/darknet version?

Thank you very much, this would clear up a lot for me

glenn-jocher commented 5 years ago

@gwestner94 we can test out the conversion mAPs. The commands (and saved outputs) are here. All 3 results are identical, performing the mAP calculation using ultralytics/yolov3. The original yolov3-spp.weights was downloaded from https://pjreddie.com/media/files/yolov3-spp.weights.

This mAP round-trip should be reproducible in our Google Colab Notebook.

If the headers are different, perhaps the header may play a role when using this repo. Feel free to submit a PR for header inclusion over at ultralytics/yolov3 if you'd like.

git clone https://github.com/ultralytics/yolov3
cd yolov3

# 1) original darknet weights ------------------------------------------------------------------
python3 test.py --weights weights/yolov3-spp.weights --save-json
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.weights')"
# Success: converted 'weights/yolov3-spp.weights' to 'converted.pt'
python3 test.py --weights converted.pt --save-json 
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp.cfg', 'converted.pt')"
# Success: converted 'converted.pt' to 'converted.weights'
python3 test.py --weights converted.weights --save-json 
#  Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.566

EDIT 1: @gwestner94 after re-reading your question a second test would be to perform the same round trip using an AlexeyAB/darknet trained network. I just so happen to have one of these. I can try the round trip again with it later.

EDIT 2: Round trip successfully performed on custom dataset trained on AlexeyAB/darknet.

# 1) original darknet weights ------------------------------------------------------------------
python3 test.py --weights ../darknet/backup/yolov3-spp-sm2-1cls_5000.weights --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.72s/it]
#                 all        25       479     0.486     0.971     0.868     0.648

# 2) converted to pytorch ---------------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp-sm2-1cls.cfg', '../darknet/backup/yolov3-spp-sm2-1cls_5000.weights')"
# Success: converted '../darknet/backup/yolov3-spp-sm2-1cls_5000.weights' to 'converted.pt'
python3 test.py --weights converted.pt --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.72s/it]
#                 all        25       479     0.486     0.971     0.868     0.648

# 3) converted back to darknet ---------------------------------------------------------------
python3  -c "from models import *; convert('cfg/yolov3-spp-sm2-1cls.cfg', 'converted.pt')"
# Success: converted 'converted.pt' to 'converted.weights'
python3 test.py --weights converted.weights --cfg cfg/yolov3-spp-sm2-1cls.cfg --data ../supermarket2/supermarket2.data
#               Class    Images   Targets         P         R       mAP        F1
# Computing mAP: 100%|██████████████████████████████████| 2/2 [00:02<00:00,  1.65s/it]
#                 all        25       479     0.486     0.971     0.868     0.648
gwestner94 commented 5 years ago

Thank you for your feedback! I will look into the header specifics and give you an update on your repository when I find a solution. It looks like the problem can be solved with correct header information.

Sudhakar17 commented 5 years ago

I tested the yolo-v3 model using COCO-Val data in both darknet and pytorch(ultralytics).

In Pytorch framework:

1. Yolo-v3.weights (original from darknet) -- 54.2% mAP
2. Yolo-v3_converted.pt (converted using ultralytics code) -- 54.2%
3. Yolo-v3_converted.weights (converted back to original weights) -- 54.2%

In Darknet framework:

1. Yolo-v3.weights (original) -- 54.37%
2. Yolo_v3_converted.weights (darknet --> pytorch --> darknet weights) -- 0%

The save_weights method from the ultralytics code:

def save_weights(self, path='model.weights', cutoff=-1):
    # Converts a PyTorch model to Darknet format (.pt to .weights)
    # Note: does not work if model.fuse() is applied
    with open(path, 'wb') as f:
        self.header_info[3] = self.seen  # number of images seen during training
        self.header_info.tofile(f)

        # Iterate through layers
        for i, (module_def, module) in enumerate(zip(self.module_defs[:cutoff], self.module_list[:cutoff])):
            if module_def['type'] == 'convolutional':
                conv_layer = module[0]
                # If batch norm, write bn parameters first
                if module_def['batch_normalize']:
                    bn_layer = module[1]
                    bn_layer.bias.data.cpu().numpy().tofile(f)
                    bn_layer.weight.data.cpu().numpy().tofile(f)
                    bn_layer.running_mean.data.cpu().numpy().tofile(f)
                    bn_layer.running_var.data.cpu().numpy().tofile(f)
                # Otherwise write the conv bias
                else:
                    conv_layer.bias.data.cpu().numpy().tofile(f)
                # Write the conv weights
                conv_layer.weight.data.cpu().numpy().tofile(f)

It writes the header info into the weights file. Can you please tell us what went wrong in the .weights file conversion? @glenn-jocher

What do you mean by "correct header information"? @gwestner94

glenn-jocher commented 5 years ago

@Sudhakar17 I don't believe anything went wrong with the weights conversion, as you can see from your own pytorch framework experiment. I myself don't have knowledge of how the headers are used in this AlexeyAB/darknet repository, that would be a question for @AlexeyAB. We do not use them at all in https://github.com/ultralytics/yolov3.

AlexeyAB commented 5 years ago

@Sudhakar17 Hi,

Can you share (e.g. via Google Drive) 4 files? I will check the difference:

  1. yolo-v3.weights (original from darknet)
  2. yolo-v3_converted.pt (converted using ultralytics code)
  3. yolo-v3_converted.weights
  4. yolov3.cfg (to be sure that we use exactly the same model)
glenn-jocher commented 5 years ago

@AlexeyAB thanks! We presently create a header of five int32 values, write the number of images seen at index 3, and leave everything else as zeros. Are there any other variables that should be written to this header when saving to a *.weights file? Is each value 32 bits?

# Needed to write header when saving *.weights
self.header_info = np.zeros(5, dtype=np.int32)  # First five are header values
self.header_info[3] = seen  # number of images seen during training
AlexeyAB commented 5 years ago

@glenn-jocher


You should use the values major=0, minor=2, revision=5. The old version 0.1.0 used uint32_t for seen instead of uint64_t, so its header was 4 bytes shorter.
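
A minimal numpy sketch of the header layout that implies (assuming little-endian byte order, as on x86):

import numpy as np

def write_darknet_header(f, seen=0):
    # major=0, minor=2, revision=5: versions >= 0.2 store 'seen' as a
    # 64-bit integer, so the header is 3 x int32 + 1 x uint64 = 20 bytes
    np.array([0, 2, 5], dtype=np.int32).tofile(f)
    np.array([seen], dtype=np.uint64).tofile(f)   # images seen during training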

Sudhakar17 commented 5 years ago

@AlexeyAB I am traveling at the moment. I used the original yolo-v3 model and the same cfg file. The mAP values will be different since I didn't update my local darknet repository; I will update my repository and rerun the converted model later. Is this header info used anywhere for calculating mAP? @glenn-jocher

Sudhakar17 commented 5 years ago

I updated the darknet repository and ran yolo-v3_converted.weights. It's not working. Any new updates? @AlexeyAB @glenn-jocher

AlexeyAB commented 5 years ago

@glenn-jocher Hi, did you fix the header (version) in your conversion script? https://github.com/AlexeyAB/darknet/issues/2914#issuecomment-496675346

glenn-jocher commented 5 years ago

@AlexeyAB @Sudhakar17 I just fixed this now in https://github.com/ultralytics/yolov3/commit/d7a28bd9f74d922216e06de3dde5f981b3002bd4

@Sudhakar17 you should now be able to run pytorch exported models in darknet.