Deadmin1 opened this issue 5 years ago
@Deadmin1 Hi,
Such an issue can occur if you use width= and height= values that are not multiples of 32.
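As a quick sanity check, a small script along these lines (the cfg path below is just a placeholder) can confirm that the width= and height= values in the [net] section are multiples of 32:

```python
# Sketch: verify that width= and height= in a Darknet cfg are multiples of 32.
import re

def check_net_dims(cfg_path):
    text = open(cfg_path).read()
    for key in ("width", "height"):
        m = re.search(rf"^\s*{key}\s*=\s*(\d+)", text, re.MULTILINE)
        value = int(m.group(1)) if m else None
        ok = value is not None and value % 32 == 0
        print(f"{key}={value} -> {'OK' if ok else 'not a multiple of 32'}")

check_net_dims("yolov3-tiny.cfg")  # placeholder path
```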
Hey @AlexeyAB, thanks again for your reply.
Which date (version) of the Darknet code do you use?
Can you attach your cfg-file to your message?
[net]
# Testing
batch=64
subdivisions=4
# Training
# batch=64
# subdivisions=2
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
max=200
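For what it's worth, the filters=18 before each [yolo] layer is consistent with classes=1 here, since for YOLOv3-style layers filters = (classes + 5) * masks_per_layer. A one-liner to recheck it:

```python
# Expected filters= for the [convolutional] layer right before a [yolo] layer:
# filters = (classes + 5) * masks_per_layer (3 masks per [yolo] layer in this cfg).
def yolo_filters(classes, masks_per_layer=3):
    return (classes + 5) * masks_per_layer

print(yolo_filters(1))  # prints 18, matching filters=18 above
```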
Did you change anything in the source code?
What command do you use for testing?
What mAP can you get?
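(For reference, testing and mAP evaluation in the AlexeyAB fork are typically run with `./darknet detector test ...` and `./darknet detector map ...`; the sketch below just wraps those calls, and every file path in it is a placeholder, not the reporter's actual files.)

```python
# Sketch: invoke the darknet CLI (AlexeyAB fork) for a single-image test and
# for mAP evaluation; all paths below are placeholders.
import subprocess

data_file = "obj.data"
cfg_file = "yolov3-tiny.cfg"
weights = "backup/yolov3-tiny_final.weights"

# Detect on one image
subprocess.run(["./darknet", "detector", "test", data_file, cfg_file, weights, "test.jpg"], check=True)

# mAP over the valid= set listed in the .data file
subprocess.run(["./darknet", "detector", "map", data_file, cfg_file, weights], check=True)
```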
But like I said, I trained it with the same .cfg and the same dataset on my computer, and it works fine.
Thanks in advance
What mAP can you get? 0.03% on the training machine (with the problem), 83% on my computer (working fine).
This is very strange.
What Makefile do you use on the training machine and on your computer?
If you use CUDA 10.0, download cuDNN v7.4.2 (Dec 14, 2018) for CUDA 10.0 instead of cuDNN v7.4.2 (Dec 14, 2018) for CUDA 9.2: https://developer.nvidia.com/rdp/cudnn-download
Sometimes you have installed cuDNN for CUDA 9.2 and then installed cuDNN for CUDA 10.0, but the paths still refer to the old cuDNN for CUDA 9.2.
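A quick way to see which cuDNN each CUDA installation actually provides is to read the CUDNN_MAJOR/MINOR/PATCHLEVEL defines from cudnn.h. A rough sketch (the install paths are assumptions; adjust them to the machine):

```python
# Sketch: print the cuDNN version found under each CUDA install path, to catch
# a build that still picks up the old cuDNN for CUDA 9.2 (paths are assumptions).
import re
import pathlib

def cudnn_version(cuda_root):
    header = pathlib.Path(cuda_root) / "include" / "cudnn.h"
    if not header.exists():
        return "no cudnn.h found"
    text = header.read_text()
    parts = [re.search(rf"#define CUDNN_{p}\s+(\d+)", text) for p in ("MAJOR", "MINOR", "PATCHLEVEL")]
    return ".".join(m.group(1) for m in parts if m)

for root in ("/usr/local/cuda", "/usr/local/cuda-9.2", "/usr/local/cuda-10.0"):
    print(root, "->", cudnn_version(root))
```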
Hey Alexey, I have to ask you and the community again. I now have access to a powerful machine (4x GTX 1080) and started training. My configuration is identical to the one on my computer, but I have problems with the bounding boxes: they are shifted. I tested with tiny and tiny_3l, both also at higher resolutions, but it's the same. The testing is done on my computer. I compiled Darknet again and it's still the same. Do you know where this comes from? Training machine: 4x GTX 1080, CUDA 10, cuDNN.
I am training multiple versions at the same time on different GPUs, with unique folders for the dataset, cfg, .data and .names files.
But the problem still exists when I train just 1 model on 1 GPU.
Edits:
1: Done:
- Downloaded and compiled Darknet on the training machine from scratch.
- Trained again with tiny-yolo.
- Checked the dataset with yolo-mark (looks good; see the label-check sketch after this list).
The problem still exists.
2: Trained a tiny-yolo on my machine and tested it. No shifting here. But I used the same training set and config file as on the more powerful machine. I can't understand where the problem could be here.
3: Compiled Darknet on the training machine without cuDNN and trained; the result still looks the same.
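On top of the yolo-mark check mentioned in edit 1, one more label-side sanity check for shifted boxes is to make sure every annotation line has five fields with the four coordinates normalized to [0,1]. A rough sketch, with a placeholder label directory:

```python
# Sketch: sanity-check YOLO-format label files (each line: class x_center y_center
# width height, coordinates normalized to [0,1]); the directory is a placeholder.
import glob

def check_labels(label_dir="data/obj"):
    for path in glob.glob(f"{label_dir}/*.txt"):
        for i, line in enumerate(open(path), start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 5:
                print(f"{path}:{i}: expected 5 fields, got {len(fields)}")
                continue
            coords = [float(x) for x in fields[1:]]
            if not all(0.0 <= c <= 1.0 for c in coords):
                print(f"{path}:{i}: coordinates outside [0,1]: {coords}")

check_labels()
```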