AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

shifted bounding boxes on multiple versions #2159

Open Deadmin1 opened 5 years ago

Deadmin1 commented 5 years ago

Hey Alexey, I have to ask you and the community again. I now have access to a powerful machine (4x GTX 1080) and have started training on it. The configuration is identical to the one on my own computer, but the bounding boxes are shifted. I tested with tiny and tiny_3l, each also at higher resolutions, and the result is the same. Testing is done on my own computer. I compiled Darknet again and the problem persists. Do you know where this comes from? Training machine: 4x GTX 1080, CUDA 10, cuDNN

I am training multiple versions at the same time on different GPUs, with unique folders for the dataset, cfg, .data and .names files.

But the problem also occurs when I train just one model on one GPU.

[Image: predictions]

Edits:

1: Done:
- Downloaded and compiled Darknet on the training machine from scratch.
- Trained again with tiny-yolo.
- Checked the dataset with Yolo_mark (looks good).
The problem still exists.

2: Trained a tiny-yolo model on my own machine and tested it. No shifting here, even though I used the same training set and config file as on the more powerful machine. I cannot understand where the problem could be.

3: Compiled Darknet on the training machine without cuDNN and trained again; the result still looks the same.
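In addition to a visual check in Yolo_mark, the label files themselves can be validated. The sketch below assumes the usual Darknet label format of one object per line, `<class> <x_center> <y_center> <width> <height>`, with the four box values given relative to the image size. The function name `check_label_line` and the sample lines are hypothetical and not part of Darknet.

```python
def check_label_line(line, num_classes, eps=1e-6):
    """Return a list of problems found in one Darknet-style label line.

    Assumed format: "<class> <x_center> <y_center> <width> <height>",
    with the four box values relative to the image size.
    """
    parts = line.split()
    if len(parts) != 5:
        return ["expected 5 fields, got %d" % len(parts)]
    try:
        class_id = int(parts[0])
        x, y, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append("class id out of range")
    if not (0.0 < w <= 1.0 and 0.0 < h <= 1.0):
        problems.append("width/height not in (0, 1]")
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        problems.append("center not in [0, 1]")
    # A box whose edges leave the image often points to a conversion error.
    if (x - w / 2 < -eps or x + w / 2 > 1 + eps
            or y - h / 2 < -eps or y + h / 2 > 1 + eps):
        problems.append("box extends past the image border")
    return problems


if __name__ == "__main__":
    # Hypothetical sample lines: one valid, one written in pixel coordinates.
    for sample in ("0 0.5 0.5 0.2 0.3", "0 120 80 40 60"):
        print(repr(sample), check_label_line(sample, num_classes=1))
```

Running this over every `.txt` label file on both machines would show whether the training machine sees the same, valid annotations as the working computer.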

AlexeyAB commented 5 years ago

@Deadmin1 Hi,

This issue can occur if you use width= and height= values that are not multiples of 32.
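A quick way to rule this out is to read `width` and `height` from the `[net]` section of the cfg and test them against 32. This is only a minimal sketch: `read_net_size` and `is_valid_size` are hypothetical helper names, and the parser handles just the simple `key=value` layout of Darknet cfg files.

```python
def read_net_size(cfg_text):
    """Return (width, height) from the [net] section of a Darknet cfg string."""
    section = None
    size = {}
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "net" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key in ("width", "height"):
                size[key] = int(value)
    return size["width"], size["height"]


def is_valid_size(width, height, stride=32):
    """True if both dimensions are multiples of the network stride."""
    return width % stride == 0 and height % stride == 0


# Hypothetical, shortened cfg used only for illustration.
example_cfg = """
[net]
width=416
height=416
[convolutional]
filters=16
"""

if __name__ == "__main__":
    w, h = read_net_size(example_cfg)
    print(w, h, is_valid_size(w, h))
```

To check a real file, pass its contents instead of `example_cfg`, for example `read_net_size(open("cfg/tiny_std.cfg").read())`.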

Deadmin1 commented 5 years ago

Hey @AlexeyAB, thanks again for your reply.

  1. What date of Darknet code do you use?

    • I tried a one-week-old version, and today I downloaded and recompiled it again. Same problem on the training machine. On my computer it works.
  2. Can you attach your cfg-file to your message?

    tiny_std.cfg

    [net]
    # Testing
    batch=64
    subdivisions=4
    # Training
    # batch=64
    # subdivisions=2
    width=416
    height=416
    channels=3
    momentum=0.9
    decay=0.0005
    angle=0
    saturation = 1.5
    exposure = 1.5
    hue=.1

    learning_rate=0.001
    burn_in=1000
    max_batches = 500200
    policy=steps
    steps=400000,450000
    scales=.1,.1

    [convolutional]
    batch_normalize=1
    filters=16
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=2

    [convolutional]
    batch_normalize=1
    filters=32
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=2

    [convolutional]
    batch_normalize=1
    filters=64
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=2

    [convolutional]
    batch_normalize=1
    filters=128
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=2

    [convolutional]
    batch_normalize=1
    filters=256
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=2

    [convolutional]
    batch_normalize=1
    filters=512
    size=3
    stride=1
    pad=1
    activation=leaky

    [maxpool]
    size=2
    stride=1

    [convolutional]
    batch_normalize=1
    filters=1024
    size=3
    stride=1
    pad=1
    activation=leaky

    ###########

    [convolutional]
    batch_normalize=1
    filters=256
    size=1
    stride=1
    pad=1
    activation=leaky

    [convolutional]
    batch_normalize=1
    filters=512
    size=3
    stride=1
    pad=1
    activation=leaky

    [convolutional]
    size=1
    stride=1
    pad=1
    filters=18
    activation=linear

    [yolo]
    mask = 3,4,5
    anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
    classes=1
    num=6
    jitter=.3
    ignore_thresh = .7
    truth_thresh = 1
    random=1

    [route]
    layers = -4

    [convolutional]
    batch_normalize=1
    filters=128
    size=1
    stride=1
    pad=1
    activation=leaky

    [upsample]
    stride=2

    [route]
    layers = -1, 8

    [convolutional]
    batch_normalize=1
    filters=256
    size=3
    stride=1
    pad=1
    activation=leaky

    [convolutional]
    size=1
    stride=1
    pad=1
    filters=18
    activation=linear

    [yolo]
    mask = 0,1,2
    anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
    classes=1
    num=6
    jitter=.3
    ignore_thresh = .7
    truth_thresh = 1
    random=1
    max=200

  3. Did you change anything in the source code?

    • I did not change anything.
  4. What command do you use for testing?

    • ./darknet detector demo cfg/cone.data cfg/tiny_std.cfg backup/tiny_std_last.weight data/test.mp4
    • ./darknet detector test cfg/cone.data cfg/tiny_std.cfg backup/tiny_std_last.weight data/cones.jpg
  5. What mAP can you get?

    • 0.03% on the training machine (with the problem)
    • 83% on my computer (working fine)
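Two numbers in the posted cfg can be checked by hand. The sketch below assumes the usual YOLOv3 rule that the convolutional layer before each `[yolo]` layer needs `(classes + 5)` filters per anchor listed in `mask`, and that each stride-2 layer halves the feature map. Both helper names are hypothetical.

```python
def expected_filters(classes, mask):
    """Filters for the conv layer feeding a [yolo] layer.

    Assumes the YOLOv3 rule: (classes + 5) outputs per anchor in the mask.
    """
    return (classes + 5) * len(mask)


def grid_size(input_size, num_stride2_layers):
    """Feature-map side length after a number of stride-2 layers."""
    factor = 2 ** num_stride2_layers
    if input_size % factor != 0:
        raise ValueError(
            "%d is not a multiple of %d" % (input_size, factor)
        )
    return input_size // factor


if __name__ == "__main__":
    # classes=1 and three mask entries, as in the cfg: (1 + 5) * 3
    print(expected_filters(1, [3, 4, 5]))
    # width=416 with five stride-2 maxpool layers: 416 / 32
    print(grid_size(416, 5))
```

With `classes=1` and three mask entries the rule gives 18, which matches `filters=18` in the cfg, and 416 divides evenly by 32 into a 13x13 grid. On that basis the cfg itself looks consistent.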

But as I said, I trained with the same .cfg and the same dataset on my computer, and it works fine.
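One way to confirm that both machines really use identical files is to compare checksums of the cfg, .data, .names and label files. The following is a minimal sketch using only the Python standard library; `sha256_of` and `fingerprint` are hypothetical helper names, and the paths in the example are taken from the commands in this thread where available.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint(paths):
    """Map each existing file path to its digest; missing files are skipped."""
    return {str(p): sha256_of(p) for p in paths if Path(p).is_file()}


if __name__ == "__main__":
    # Paths as used in the commands above; adjust to the local layout.
    files = ["cfg/cone.data", "cfg/tiny_std.cfg"]
    for name, digest in sorted(fingerprint(files).items()):
        print(digest, name)
```

Running the script on both machines and comparing the printed lines would reveal any byte-level difference, including differing line endings, between the two copies.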

Thanks in advance

AlexeyAB commented 5 years ago

What mAP can you get?

    • 0.03% on the training machine (with the problem)
    • 83% on my computer (working fine)

This is very strange.