Open Hishok opened 4 years ago
Is training stopping or moving on?
Also you can try to comment these 3 lines: https://github.com/AlexeyAB/darknet/blob/fdb1841eb1699cf24a193787c2da7b2d9bd10070/src/network_kernels.cu#L390-L392
Training stopped at that point. The error mentions attention_img; does this have something to do with the attention flag? I am running this in Google Colab. @AlexeyAB
I have used that in the cfg-file but I am getting the error. Would I have to comment out the lines you mentioned? @AlexeyAB
Use attention=0 in cfg-file
@AlexeyAB Getting a different error now when using attention=0 in the config file:
Did you comment these 3 lines? https://github.com/AlexeyAB/darknet/blob/fdb1841eb1699cf24a193787c2da7b2d9bd10070/src/network_kernels.cu#L390-L392
I tried updating it by forking darknet, but when I follow your Google Colab tutorial I can't get it to work in terms of creating the darknet environment @AlexeyAB
Can you give a link to colab with error?
The link to colab: https://colab.research.google.com/drive/1sA1o2HzV-_10bQqN8t64wlfGSv2Mi0pB?usp=sharing
The error I am getting is:
8 errors detected in the compilation of "/tmp/tmpxft_0000033f_00000000-7_network_kernels.compute_70.cpp1.ii".
Makefile:168: recipe for target 'obj/network_kernels.o' failed
make: *** [obj/network_kernels.o] Error 1
You can see this in cell 4.
In cell 21 I get the error: /bin/bash: ./darknet: No such file or directory
I must be doing something wrong because I managed to get a number of YOLOv3 and v4 models working without forking darknet.
@AlexeyAB
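The `./darknet: No such file or directory` message in cell 21 is most likely a consequence of the failed `make` in cell 4: if compilation stops, no binary is produced. A minimal sketch of a check that could run in a notebook cell before training, assuming the repo was cloned into a folder named `darknet` (the helper name `darknet_binary_ready` is hypothetical):

```python
import os

def darknet_binary_ready(repo_dir="darknet", name="darknet"):
    """Return True if an executable file called `name` exists in `repo_dir`.

    `repo_dir` and `name` are assumptions about the notebook layout;
    adjust them to wherever `make` was run.
    """
    path = os.path.join(repo_dir, name)
    return os.path.isfile(path) and os.access(path, os.X_OK)

# Example use in a Colab cell:
#   if not darknet_binary_ready():
#       print("darknet was not built; scroll up to the make output for errors")
```

If this returns False, the compile errors in the `make` output are the thing to fix first; rerunning the training cell will not help.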
Hi @AlexeyAB do you know where this is going wrong?
Everything works well. I just opened your colab link and clicked Run All, and I didn't get any errors: https://colab.research.google.com/gist/AlexeyAB/5f1434d054d5d704806461612bc8e93c/yolov4-sat.ipynb
@AlexeyAB apologies, I sent the wrong colab link. This is the correct one: https://colab.research.google.com/drive/1sA1o2HzV-_10bQqN8t64wlfGSv2Mi0pB?usp=sharing
The first cell shows where I forked darknet and I can't create the darknet environment.
This is the same link. Open your link -> press Runtime -> press Restart and run all
@AlexeyAB Sorry I sent you the exact same link. The actual link is https://colab.research.google.com/drive/1CDN0nBRkraLknTmBQX2i7C5mv1PkK4HX?usp=sharing
In cell 1 when I change it to
# clone darknet repo
!git clone https://github.com/Hishok/darknet
to pick up the change to the 3 lines of code, I can't build darknet.
If I use:
# clone darknet repo
!git clone https://github.com/AlexeyAB/darknet
I can build darknet; however, I get the errors mentioned in my original post, because that repo does not have the change to the 3 lines of code that you mentioned.
Remove your repo, fork it from https://github.com/AlexeyAB/darknet and comment these 3 lines.
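Commenting out the 3 lines in a fresh fork can also be done from a notebook cell. The sketch below is a generic helper (the name `comment_out_lines` is hypothetical) that prefixes a 1-indexed, inclusive line range with `//`. The range 390-392 comes from the link above and is only valid for that pinned commit of `src/network_kernels.cu`; in any other revision the lines may have moved, so check the returned lines before compiling.

```python
def comment_out_lines(path, first, last, prefix="// "):
    """Prefix lines first..last (1-indexed, inclusive) of a C/CUDA file with `//`.

    Lines that already start with `//` are left alone, so running this
    twice does not double-comment. Returns the affected lines for review.
    """
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
    if not (1 <= first <= last <= len(lines)):
        raise ValueError(f"range {first}-{last} is outside the file ({len(lines)} lines)")
    for i in range(first - 1, last):
        if not lines[i].lstrip().startswith("//"):
            lines[i] = prefix + lines[i]
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
    return [line.rstrip("\n") for line in lines[first - 1:last]]

# Example (path and line numbers are assumptions tied to the pinned commit):
#   print(comment_out_lines("darknet/src/network_kernels.cu", 390, 392))
```

Line-by-line `//` comments are used here rather than a `/* ... */` block so the edit stays valid even if the range already contains a block comment.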
Thank you @AlexeyAB, it looks like it is working! Training hasn't stopped.
Hi @AlexeyAB, I have trained using SAT on the BDD100K dataset; however, after 9000 iterations the mAP score falls to 0. I have been running the Colab cell using %%capture, otherwise my laptop crashes.
Below is how the mAP score varies over the iterations:
The CFG file:
[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=32
width=416
height=416
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
adversarial_lr= 1
#attention=1
learning_rate=0.001
burn_in=1000
max_batches = 20000
policy=steps
steps=16000,18000
scales=.1,.1
#cutmix=1
mosaic=1
@lukeai
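Running the cell under `%%capture` hides the training output entirely, which makes it hard to see where the mAP collapses. One alternative, sketched here under the assumption that the training command is launched from Python rather than with `!`, is to send the output to a log file and read back only the last few lines. The helper names `run_logged` and `tail` are hypothetical.

```python
import subprocess

def run_logged(cmd, log_path):
    """Run `cmd` (a list of arguments), write stdout and stderr to `log_path`.

    Returns the process exit code. Nothing is printed to the notebook,
    so the browser does not have to render the training output.
    """
    with open(log_path, "w") as log:
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode

def tail(log_path, n=20):
    """Return the last `n` lines of the log file as a list of strings."""
    with open(log_path, errors="replace") as f:
        return f.read().splitlines()[-n:]

# Example with the training command used in this thread (log name is an assumption):
#   run_logged(["./darknet", "detector", "train", "data/obj.data",
#               "cfg/yolov4-obj.cfg", "yolov4.conv.137", "-dont_show", "-map"],
#              "train.log")
#   print("\n".join(tail("train.log")))
```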
Try to use a lower adversarial_lr=0.1 or 0.01
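Trying several `adversarial_lr` values means editing the `[net]` section of the cfg each time. A minimal sketch of doing that programmatically, assuming the cfg uses the plain `key=value` layout shown above (the helper name `set_net_option` is hypothetical and is not part of darknet):

```python
import re

def set_net_option(cfg_text, key, value):
    """Return `cfg_text` with `key=value` set inside the [net] section.

    An existing active line for `key` is replaced; commented-out lines
    (starting with #) are left untouched. If the key is absent, it is
    appended at the end of the [net] section.
    """
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*=")
    out, in_net, done = [], False, False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Leaving [net] without having seen the key: add it before the next section.
            if in_net and not done:
                out.append(f"{key}={value}")
                done = True
            in_net = stripped == "[net]"
        elif in_net and not done and pattern.match(line):
            line = f"{key}={value}"
            done = True
        out.append(line)
    if in_net and not done:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"

# Example (file name is an assumption):
#   text = open("cfg/yolov4-obj.cfg").read()
#   open("cfg/yolov4-obj.cfg", "w").write(set_net_option(text, "adversarial_lr", 0.1))
```

The same helper covers the earlier suggestion of setting `attention=0`, since the commented `#attention=1` line is ignored and a new active line is added.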
Thank you @AlexeyAB. I have tried both, and at around 4000 and 5000 iterations the AP and mAP go to 0 and the precision is -nan. Would you recommend going lower than 0.01?
I am running YOLOv4 with adversarial training; however, I keep getting the error 'cannot open display' even though I am using the -dont_show flag.
I have tried setting adversarial_lr to 1 and 0.05 but I am still getting the same error.
Below is a small section of the config file that I am using, showing where I have put adversarial_lr. I managed to get YOLOv4 to work without adversarial_lr but now I am getting this error. I am using the BDD100K dataset.
!./darknet detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137 -dont_show -map
The line above is what I am using to train.
Thank you in advance!
@AlexeyAB @lukeAI
5117