A Simple and Efficient Network for Small Target Detection

mrhosseini commented 5 years ago

Hi,

This paper proposes a new network configuration for small target detection and claims accuracy close to YoloV3 at a speed close to YoloV3-Tiny. The main idea is to use dilated and 1x1 convolutions.

I tried to implement the network using this repo, but during training I always get NaN for loss and avg loss.

Here is the configuration that I used for single class detection:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=8
width=512
height=512
channels=1
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 2.0
hue=0

learning_rate=0.001
burn_in=1000
max_batches = 8000
policy=steps
steps=6400,7200
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=2

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=4

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=9, 13

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers=8

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[reorg3d]
stride=2

[route]
layers=25, 28

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=leaky

and this is the proposed network in the paper:

Any advice for solving the problem?
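For readers following the configuration above, here is a minimal sketch of how its [route] indices can be read. It assumes the usual Darknet convention that every section after [net] is a layer counted from 0 and that a negative index in `layers=` is relative to the [route] section itself. The helper name `resolve_route` is hypothetical and is not part of Darknet.

```python
def resolve_route(route_index, layers):
    """Turn Darknet-style route indices into absolute layer numbers.

    route_index: position of the [route] section itself (0-based, [net] not counted).
    layers: the values written in `layers=`; negative values are relative.
    """
    return [route_index + l if l < 0 else l for l in layers]


# The first [route] in the cfg above is section number 12.
print(resolve_route(12, [-1, -3]))  # [11, 9]
# The last [route] is section number 29 and uses absolute indices.
print(resolve_route(29, [25, 28]))  # [25, 28]
```

Counting sections this way, the two dilated convolutions are layers 10 and 14, and the [reorg3d] section is layer 28.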

AlexeyAB commented 5 years ago

I don't see a [yolo] layer in your cfg-file. Can you rename it to a txt-file and attach your whole cfg-file?

mrhosseini commented 5 years ago

The cfg-file: network.cfg.txt

I don't see a [yolo] layer in your cfg-file.

The proposed network in the paper does not have any [yolo] or [cost] layers.

Based on the yolov3-tiny.cfg file, I changed the activation function of the last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained, but the performance is weaker than YoloV3-Tiny. There are no more NaN values for loss and avg loss, but these values oscillate over a much larger range than with YoloV3-Tiny.

AlexeyAB commented 5 years ago

The proposed network in the paper does not have any [yolo] or [cost] layers.

The proposed network in the paper has a [yolo] layer.

mrhosseini commented 5 years ago

I thought that in the table, the left column is the architecture of the authors' proposed network and the right column is the architecture of Tiny YoloV3, with each column presenting a separate, independent architecture. Therefore, the [yolo] layer you mentioned is in Tiny YoloV3, not in the proposed network.

AlexeyAB commented 5 years ago

Yes, sure, you are right :) But still, no detection network can work without a detection head: [yolo], SSD, Faster RCNN, ...

mrhosseini commented 5 years ago

Thanks. So there may be a mistake in the table. As I mentioned before, adding a [yolo] layer after the last convolution layer did not give any interesting results:

Based on the yolov3-tiny.cfg file, I changed the activation function of the last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained, but the performance is weaker than YoloV3-Tiny. There are no more NaN values for loss and avg loss, but these values oscillate over a much larger range than with YoloV3-Tiny.

Apart from the [yolo] layer, does the configuration in network_with_yolo.cfg.txt conform to the network proposed in the paper? I used [route] layers for the Concatenation layers and a [reorg3d] layer for the Passthrough layer.
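To make the Passthrough and Concatenation mapping concrete, here is a rough shape-tracking sketch. It assumes that `pad=1` means a padding of `size // 2`, that [reorg3d] acts as a space-to-depth rearrangement, and that a multi-input [route] concatenates along channels. Dilation is ignored, and the helper names are hypothetical.

```python
def conv_out(n, size, stride):
    """Spatial output size of a convolution, assuming pad=1 pads by size // 2."""
    pad = size // 2
    return (n + 2 * pad - size) // stride + 1


def reorg(shape, stride):
    """Space-to-depth: (h, w, c) -> (h/stride, w/stride, c*stride*stride)."""
    h, w, c = shape
    return (h // stride, w // stride, c * stride * stride)


def concat(*shapes):
    """Channel-wise concatenation of feature maps with equal height and width."""
    h, w = shapes[0][:2]
    if any(s[:2] != (h, w) for s in shapes):
        raise ValueError("route inputs must share height and width")
    return (h, w, sum(s[2] for s in shapes))


# Three stride-2 3x3 convolutions take the 512x512 input down to 64x64.
n = 512
for _ in range(3):
    n = conv_out(n, size=3, stride=2)

deep = (n, n, 128)                      # last 3x3 conv before the final [route]
passthrough = reorg((128, 128, 16), 2)  # 1x1 conv with 16 filters, then [reorg3d]
print(concat(deep, passthrough))        # (64, 64, 192)
```

Under these assumptions the final [route] only works because the [reorg3d] output has the same 64x64 grid as the deeper branch.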

AlexeyAB commented 5 years ago

Now it can be trained but the performance is weaker than YoloV3-Tiny.

What dataset do you use?
How many training images?
What is the average size of objects after resizing images to the network size 512x512?
What mAP did you get in both cases?
Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?

Yes, it seems network_with_yolo.cfg.txt conforms to the network proposed in the paper.

I used [route] layers for the Concatenation layers and a [reorg3d] layer for the Passthrough layer.

That's right.

Try using the following in the last [convolutional] layer and in the [yolo] layer:

filters=36
...

mask = 0,1,2,3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
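The value `filters=36` in the suggestion above follows from the rule given in the Darknet README: the [convolutional] layer feeding a [yolo] layer needs `(classes + 5) * <number of masks>` filters. A minimal sketch, with a hypothetical helper name:

```python
def yolo_filters(classes, num_masks):
    """Filters for the [convolutional] layer that feeds a [yolo] layer."""
    return (classes + 5) * num_masks


print(yolo_filters(1, 3))  # 18: one class, three masks
print(yolo_filters(1, 6))  # 36: one class, mask = 0,1,2,3,4,5
```

With a single class, using all six anchors in one [yolo] layer therefore doubles the filter count from 18 to 36.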

mrhosseini commented 5 years ago

What dataset do you use?

A custom dataset. I have not tested the datasets used in the paper.

How many training images?

1734 images.

What is the average size of objects after resizing images to the network size 512x512?

About 30x30.

What mAP did you get in both cases?

Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?

chart.png for yolov3-tiny.cfg.txt:

[image: yolov3-tiny]

chart.png for network_with_yolo.cfg.txt:

[image: network_with_yolo]

Note that:

The validation set used for mAP calculation is different from the training set.
Anchors are calculated for the dataset using darknet detector calc_anchors.
Network image size for Tiny YoloV3 is 416x416.

Try to use in the [yolo] layer

filters=36 ...

mask = 0,1,2,3,4,5

Without these changes the mAP was lower and the avg loss swung over a larger range.

We have a separate test set. Here are the results of darknet detector map:

With best weights using yolov3-tiny.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 573, unique_truth_count = 108  
class_id = 0, name = cls, ap = 74.57%        (TP = 83, FP = 44) 

 for conf_thresh = 0.25, precision = 0.65, recall = 0.77, F1-score = 0.71 
 for conf_thresh = 0.25, TP = 83, FP = 44, FN = 25, average IoU = 45.68 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.745731, or 74.57 % 
Total Detection Time: 0.000000 Seconds

With best weights using network_with_yolo.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 576, unique_truth_count = 108  
class_id = 0, name = cls, ap = 67.20%        (TP = 82, FP = 67) 

 for conf_thresh = 0.25, precision = 0.55, recall = 0.76, F1-score = 0.64 
 for conf_thresh = 0.25, TP = 82, FP = 67, FN = 26, average IoU = 40.94 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.671979, or 67.20 % 
Total Detection Time: 2.000000 Seconds
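As a sanity check on the two reports above, the precision, recall and F1-score lines can be recomputed from the TP/FP/FN counts using the standard definitions. This is a sketch and is not taken from the Darknet source; the function name is hypothetical.

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall and F1-score from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the two `darknet detector map` outputs above.
reports = [("yolov3-tiny", (83, 44, 25)), ("network_with_yolo", (82, 67, 26))]
for name, counts in reports:
    p, r, f1 = detection_metrics(*counts)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
# yolov3-tiny: precision=0.65 recall=0.77 F1=0.71
# network_with_yolo: precision=0.55 recall=0.76 F1=0.64
```

Recall is almost identical for the two networks. The gap comes from the extra false positives (67 versus 44) produced by network_with_yolo.cfg.txt.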

AlexeyAB commented 5 years ago

Are mAPs on the charts for Training or Validation dataset?

mrhosseini commented 5 years ago

Are mAPs on the charts for Training or Validation dataset?

Validation

AlexeyAB commented 5 years ago

Why do you get 99.9% on the chart but 67.20% from ./darknet detector map ... for network_with_yolo.cfg.txt?

mrhosseini commented 5 years ago

Why do you get 99.9% on the chart but 67.20% from ./darknet detector map ... for network_with_yolo.cfg.txt?

I used a separate test set for darknet detector map, which is different from the validation set used in training.

AlexeyAB commented 5 years ago

Did you get the Training/Valid/Test datasets by dividing a single dataset uniformly at random into 80%/10%/10%?

mrhosseini commented 5 years ago

Did you get the Training/Valid/Test datasets by dividing a single dataset uniformly at random into 80%/10%/10%?

The train and valid sets are selected randomly from a single dataset, with 1734 images for train and 530 images for valid. But the test set is an independent set.

AlexeyAB commented 5 years ago

So maybe this is the reason. You train on one kind of objects but test on others.
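A minimal sketch of the uniform random 80%/10%/10% split asked about above, so that all three sets come from the same distribution. The file names and the `split_dataset` helper are hypothetical and would need to be adapted to the real image list.

```python
import random


def split_dataset(paths, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle once, then cut the list into train/valid/test parts."""
    items = list(paths)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = int(len(items) * fractions[0])
    n_valid = int(len(items) * fractions[1])
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]    # whatever remains goes to test
    return train, valid, test


# Hypothetical file names; replace with the real image list.
images = [f"data/img_{i:04d}.jpg" for i in range(100)]
train, valid, test = split_dataset(images)
print(len(train), len(valid), len(test))
```

Each part can then be written to its own text file, one image path per line, which is the list format Darknet reads for the `train` and `valid` entries of the data file.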

mrhosseini commented 5 years ago

So maybe this is the reason. You train on one kind of objects but test on others.

Yes, you are right

sctrueew commented 5 years ago

@mrhosseini Hi,

When I use network_with_yolo.cfg, I get this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16() cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed: filters=138 and classes to 18.

mrhosseini commented 5 years ago

When I use network_with_yolo.cfg, I get this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16() cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed: filters=138 and classes to 18.

@zpmmehrdad Hi, unfortunately I'm not familiar with cuDNN. Maybe @AlexeyAB can help you.

sctrueew commented 5 years ago

@mrhosseini Hi,

Thanks. What cuDNN and CUDA versions are you using?

AlexeyAB commented 5 years ago

@zpmmehrdad

What GPU do you use?
What command do you use?
Can you show output of commands:
```
nvcc --version
nvidia-smi
```

sctrueew commented 5 years ago

@AlexeyAB Hi,

I'm using OS: Win10, command: darknet.exe detector train a.obj network_with_yolo.cfg -map

output:

compute_capability = 610, cudnn_half = 0
layer filters size/strd(dil) input output
0 conv 16 3 x 3/ 1 512 x 512 x 1 -> 512 x 512 x 16 0.075 BF
1 conv 32 3 x 3/ 2 512 x 512 x 16 -> 256 x 256 x 32 0.604 BF
2 conv 16 1 x 1/ 1 256 x 256 x 32 -> 256 x 256 x 16 0.067 BF
3 conv 32 3 x 3/ 1 256 x 256 x 16 -> 256 x 256 x 32 0.604 BF
4 conv 64 3 x 3/ 2 256 x 256 x 32 -> 128 x 128 x 64 0.604 BF
5 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
6 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
7 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
8 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
9 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
10 cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16() : line: 157 : build time: Oct 22 2019 - 09:30:52
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED: No error
Assertion failed: 0, file ....\src\utils.c, line 293

AlexeyAB commented 4 years ago

@zpmmehrdad What GPU do you use?

sctrueew commented 4 years ago

@zpmmehrdad What GPU do you use?

@AlexeyAB Hi, GTX 1080 ti

sctrueew commented 4 years ago

@AlexeyAB Hi, I found the problem. I updated the CUDA version from 9.1 to 10.0 and it works now.

leiyaohui commented 4 years ago

@mrhosseini Hello. I have also been studying this field recently. Are you running on Windows? If so, could you package your compiled Darknet and send me a copy? I ran into a lot of errors when compiling. My email is 1373890292@qq.com. I look forward to your reply.

mrhosseini commented 4 years ago

Hi @leiyaohui, unfortunately I use Ubuntu. Try one of the methods here. You may open a new issue if you run into errors.

leiyaohui commented 4 years ago

Did you write the expansion convolution or did it come with Darknet itself?

mrhosseini commented 4 years ago

Did you write the expansion convolution or did it come with Darknet itself?

The dilated convolution is implemented in this repository. You can use this configuration file for the network proposed in the paper mentioned above.

AlexeyAB / darknet

A Simple and Efficient Network for Small Target Detection #4213