@Chaimmoon
Thank you for pointing out the typo. It should be 800,000, which is the same as in the cfg.
I have only implemented CSPDenseNet and CSPDarknet with PyTorch.
Following are the results of (CSP)DenseNet-{121, 169, 201, 264} with PyTorch,
and my PyTorch implementations of darknet53 and cspdarknet53 get 76.3/92.9 and 76.9/93.3 top-1/top-5 accuracy with 224x224 input resolution, respectively.
You should make sure the BN layers and activation functions are the same as in the provided cfg file.
@Chaimmoon
This is my PyTorch implementation of CSPDarknet: darknet.py.txt
I borrowed some functions from mmdetection and mmcv.
The main difference between CSPDarknet and CSPResNe(X)t is that CSPDarknet uses a `darknet_layer` while CSPResNe(X)t uses a `resne(x)t_layer`:
```python
x = down_layer(x)             # stride-2 downsampling convolution
x1, x2 = x.chunk(2, dim=1)    # split the channels into two parts
x2 = darknet_layer(x2)        # residual blocks run on one part only
x = torch.cat([x1, x2], 1)    # concatenate the two paths
x = tran_layer(x)             # 1x1 transition conv fuses them
```
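For context, here is a self-contained sketch of how such a CSP stage could look in PyTorch. The `down_layer`/`darknet_layer`/`tran_layer` names follow the snippet above; the block internals (channel widths, LeakyReLU slope) are my assumptions, not the exact code from darknet.py.txt:

```python
import torch
import torch.nn as nn

def conv_bn_act(c_in, c_out, k=3, s=1):
    # Conv + BN + activation building block, mirroring the cfg-style layers.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1, inplace=True),
    )

class DarkBlock(nn.Module):
    # Darknet residual block: 1x1 conv then 3x3 conv, with identity shortcut.
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_bn_act(c, c, k=1), conv_bn_act(c, c, k=3))

    def forward(self, x):
        return x + self.body(x)

class CSPStage(nn.Module):
    def __init__(self, c_in, c_out, n_blocks):
        super().__init__()
        half = c_out // 2
        self.down_layer = conv_bn_act(c_in, c_out, k=3, s=2)  # stride-2 downsample
        self.darknet_layer = nn.Sequential(*[DarkBlock(half) for _ in range(n_blocks)])
        self.tran_layer = conv_bn_act(c_out, c_out, k=1)      # transition conv

    def forward(self, x):
        x = self.down_layer(x)
        x1, x2 = x.chunk(2, dim=1)      # cross-stage split
        x2 = self.darknet_layer(x2)     # only half the channels pass through blocks
        x = torch.cat([x1, x2], 1)      # partial concatenation
        return self.tran_layer(x)

# e.g. CSPStage(64, 128, 2)(torch.randn(1, 64, 56, 56)).shape -> (1, 128, 28, 28)
```

Swapping `DarkBlock` for a ResNe(X)t bottleneck would give the CSPResNe(X)t variant described above.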
@WongKinYiu
Thanks for your reply!
I implemented ResNet10, ResNet50, and ResNeXt50. The results are not quite as good as your paper reports... (Besides, can you provide the cfg file for ResNet10_CSP? The architectures of ResNet10 and ResNet50 are quite different.)
As for the BN, it should be torch.nn.BatchNorm2d, and the activation function should be torch.nn.LeakyReLU, right?
Can you provide your PyTorch code? Thanks
Best, Mu
@Chaimmoon
My PyTorch code is posted on https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-623125410.
I am sorry that I cannot release my lightweight models due to some issues. You can try to follow the rule of ResNet50 -> CSPResNet50 to modify ResNet10 -> CSPResNet10.
@WongKinYiu
Thanks for your work!
I have a question about the `[sam]` layer.
In https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199, the SAM module consists of one `[convolutional]` layer and one `[sam]` layer.
But in https://github.com/AlexeyAB/darknet/issues/5355#issuecomment-619859913, the SAM module consists of two `[convolutional]` layers and one `[sam]` layer, not one `[convolutional]` layer, like the following:
```
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2
```
What's more, in https://github.com/AlexeyAB/darknet/issues/5355#issuecomment-619859913 the `[convolutional]` layer in front of the `[sam]` layer has `pad=1`, while in https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199 the `[convolutional]` layer in front of the `[sam]` layer does not have `pad=1`.
I want to know: which `[sam]` setup is correct?
@nyj-ocean Hello,
1. This is the SAM module:

```
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=512
activation=mish

[convolutional]
batch_normalize=1
filters=512
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2
```
![image](https://user-images.githubusercontent.com/12152972/81059524-23695180-8f03-11ea-9498-d17c8277739a.png)
2. In https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518618199:
![](https://user-images.githubusercontent.com/55009815/81057829-c6b86780-8eff-11ea-8618-da086c5815cf.png)
this is the usage of the `[sam]` layer.
![image](https://user-images.githubusercontent.com/12152972/81059834-d639af80-8f03-11ea-8cc3-3c7d7d6ca096.png)
3. `pad=1` and `pad=0` are the same when the convolutional filter size is `1x1`: darknet sets the padding to `size/2` when `pad=1`, which is 0 for a 1x1 kernel.
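Translated into PyTorch, the cfg above might look roughly like this (a sketch of my own, not code from the repo; darknet's `[sam]` layer multiplies the previous layer's output elementwise with the layer given by `from=`, and `nn.Mish` requires a recent PyTorch):

```python
import torch.nn as nn

class SAMModule(nn.Module):
    # Mirrors the cfg: a 3x3 mish conv, then a 1x1 conv with logistic
    # (sigmoid) activation, then [sam] from=-2, i.e. the sigmoid output
    # multiplies the 3x3 conv's output elementwise.
    def __init__(self, channels=512):
        super().__init__()
        self.conv3x3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.Mish(),
        )
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels, 1, stride=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        feat = self.conv3x3(x)   # first [convolutional], activation=mish
        a = self.attn(feat)      # second [convolutional], activation=logistic
        return feat * a          # [sam] from=-2
```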
@WongKinYiu
Thanks for your reply!
I want to add the SAM module to YOLOv3.
Can you help me check whether the following cfg is right?
@nyj-ocean
The latest `[sam]` block seems to be at a different layer compared with the 1st and 2nd `[sam]` blocks in your cfg file.
In my previous experiments, I used the sam layer as in SAM-to-yolov3.cfg.txt.
@WongKinYiu
Thanks for your help!
I noticed that the yolov4 paper mentions a modified SAM block.
Is the SAM block in your provided SAM-to-yolov3.cfg.txt (https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-624093575) equal to the modified SAM block mentioned in yolov4?
Yes, it is the same. The comparison of w/ and w/o SAM is posted in the 1st table of the README in this repo.
@WongKinYiu thanks for your help!!!
@WongKinYiu
Hi, I have checked the network structure and number of parameters in my CSPResNet/CSPResNeXt PyTorch implementation, and they match what you reported in your GitHub README file, including nn.BatchNorm2d, nn.LeakyReLU, training epochs, batch size, and learning rate schedule. I also had a close look at your Darknet PyTorch implementation. However, the accuracy is still below yours...
My results:
- CSPResNet50: Prec@1 75.772, Prec@5 92.716 (paper results: 76.6% / 93.3%)
- CSPResNeXt50: Prec@1 76.328, Prec@5 93.058 (paper results: 77.9% / 94.0%)
Thanks!
@Chaimmoon
I am not sure whether it is important or not; I just follow https://pjreddie.com/darknet/imagenet/.
I think getting a little bit lower accuracy is normal, since darknet uses 256x256 for validation, and I guess your PyTorch code uses 224x224 instead. My CSPDarknet53 PyTorch (224x224) implementation also gets 0.6% lower top-1 accuracy than the Darknet (256x256) implementation.
Could you share your code for CSPResNet/CSPResNeXt? I would like to upload the implementation and results to the pytorch branch if that is OK.
@WongKinYiu I'm sorry to bother you again.
I notice that the modified SAM in the yolov4 paper references the CBAM paper.
However, I also find that the ThunderNet paper designs a SAM as well.
So I want to know:
1. Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?
2. In the yolov4 paper, the modified SAM references the CBAM paper. But in https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-518583264, LukeAI said the `[sam]` layer is for ThunderNet. Are the two statements in conflict? Which one is correct?
@nyj-ocean
There are many kinds of channel attention modules (CAM) and spatial attention modules (SAM) in the literature. For example, SENet and SKNet proposed different kinds of CAM, and CBAM and ThunderNet proposed different kinds of SAM. In general, we cite the first paper, the most similar paper, or both in related work. So the answers to your questions are:
- Is the SAM in the CBAM paper the same as the SAM in the ThunderNet paper?
No, they are different.
- In the yolov4 paper, the modified SAM references the CBAM paper. But in AlexeyAB/darknet#3708 (comment), LukeAI said the `[sam]` layer is for ThunderNet. Are the two statements in conflict? Which one is correct?
CBAM is the first paper that proposed SAM, so we cite it in the yolov4 paper. ThunderNet proposed the SAM module most similar to ours, so we cite it in the CSPNet paper.
SAM in CBAM:
SAM in ThunderNet:
@WongKinYiu Thanks for your reply. The yolov4 paper modifies SAM from spatial-wise attention to point-wise attention. So is the SAM module before the modification in yolov4 (that is, spatial-wise attention) similar to the SAM module in the CBAM paper?
Yes, all of the different kinds of SAM modules produce spatial attention.
@WongKinYiu thanks a lot
Hi @WongKinYiu
Thanks for your reply! I think that during training and testing, the Darknet framework keeps the image size at 256x256, while for common PyTorch training, the training size is 224x224 and the test size is 256x256. Is my understanding right?
@Chaimmoon
It depends on your code. The most common testing protocol in PyTorch is single-crop (224x224): https://pytorch.org/docs/stable/torchvision/models.html. The other common testing protocols nowadays are 10-crop (224x224, 5 crops + flips), 5-crop (224x224, center + 4 corners), and full image (256x256).
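For reference, the single-crop protocol usually looks like this in torchvision (the common recipe, shown here only as an illustration, not necessarily the exact code used in this thread):

```python
import torchvision.transforms as T

# Single-crop ImageNet validation: resize the short side to 256, take a
# 224x224 center crop, and normalize with the usual ImageNet statistics.
val_transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```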
@WongKinYiu
I'm sorry to bother you again.
I want to produce the picture of yolov3's anchors, like the following, but I don't know how to do it.
Can you tell me how to produce this picture of the anchors?
@nyj-ocean
I do not know either; I always use the anchors which yolo9000 calculated.
You can calculate new anchors by using this command:
```
./darknet detector calc_anchors coco.data -num_of_clusters 9 -width 512 -height 512 -show
```
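Conceptually, `calc_anchors` runs k-means clustering over the (width, height) of all ground-truth boxes scaled to the network input size. A rough Python sketch of that idea, using the 1-IoU distance from the YOLOv2 paper (darknet's exact implementation may differ):

```python
import numpy as np

def iou_wh(box, clusters):
    # IoU between one (w, h) box and k cluster boxes, anchored at the same corner.
    inter = np.minimum(box[0], clusters[:, 0]) * np.minimum(box[1], clusters[:, 1])
    union = box[0] * box[1] + clusters[:, 0] * clusters[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    # boxes: N x 2 array of (w, h) scaled to the network input, e.g. 512x512.
    boxes = boxes.astype(float)
    rng = np.random.default_rng(seed)
    clusters = boxes[rng.choice(len(boxes), k, replace=False)].copy()
    for _ in range(iters):
        dists = np.stack([1 - iou_wh(b, clusters) for b in boxes])
        nearest = dists.argmin(axis=1)        # assign each box to its nearest cluster
        for i in range(k):
            if (nearest == i).any():
                clusters[i] = boxes[nearest == i].mean(axis=0)
    return clusters[np.argsort(clusters.prod(axis=1))]  # sort anchors by area
```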
@WongKinYiu thanks for your reply
@AlexeyAB
Thank you so much!!
It helps me a lot!
If the background color of cloud.png were white, it would be better for me.
How can I change the background color of cloud.png from black to white?
Change these two lines in src/image_opencv.cpp:
`img = cv::Scalar::all(255);`
https://github.com/AlexeyAB/darknet/blob/bef28445e57cd560fa3d0a24af98a562d289135b/src/image_opencv.cpp#L1472
`cv::rectangle(img, pt1, pt2, CV_RGB(0, 0, 0), 1, 8, 0);`
https://github.com/AlexeyAB/darknet/blob/bef28445e57cd560fa3d0a24af98a562d289135b/src/image_opencv.cpp#L1490
@AlexeyAB great! thanks a lot
@AlexeyAB
Sorry to bother you again.
I use the following command to generate my cloud.png on my own dataset:
```
./darknet detector calc_anchors my-own-dataset.data -num_of_clusters 9 -width 608 -height 608 -show
```
The following figure is my cloud.png.
I find that there are many black spare parts in my own cloud.png, while there are almost no black spare parts in the cloud.png of the COCO dataset; the anchors almost fill the whole COCO cloud.png (see https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-627941826).
Is there any problem with my own cloud.png, or with the anchors that I generated on my own dataset?
How can I eliminate the black spare parts in my own cloud.png?
I guess the images in your dataset are from videos.
What are the black spare parts? There is no problem.
@AlexeyAB
The black spare parts are like the following:
There are many black spare parts in my own cloud.png, while there are almost no black spare parts in the cloud.png of the COCO dataset (see https://github.com/WongKinYiu/CrossStagePartialNetworks/issues/24#issuecomment-627941826).
Why are there many black spare parts in my own cloud.png? Is it normal?
I want to eliminate these black spare parts in my own cloud.png. How can I do that?
@WongKinYiu The images in my dataset are not taken from videos.
> Why are there many black spare parts in my own cloud.png?

Because your objects are small relative to the image size. This is normal.
Maybe you should just use a higher network resolution for anchor calculation, training, and detection to get good results:
https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

> Only if you are an expert in neural detection networks - recalculate anchors for your dataset for width and height from cfg-file: `darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416`, then set the same 9 anchors in each of 3 [yolo]-layers in your cfg-file. But you should change indexes of anchors masks= for each [yolo]-layer, so that 1st-[yolo]-layer has anchors larger than 60x60, 2nd larger than 30x30, 3rd remaining. Also you should change the filters=(classes + 5)*<number of mask> before each [yolo]-layer. If many of the calculated anchors do not fit under the appropriate layers - then just try using all the default anchors.
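As a worked example of the quoted rule: with 3 classes and 3 masks in a [yolo]-layer, filters = (3 + 5) * 3 = 24. A small sketch (the anchor values and class count below are placeholders, not from this thread):

```python
# Split 9 calculated anchors across the 3 [yolo]-layers by the quoted rule
# (1st layer: anchors larger than 60x60, 2nd: larger than 30x30, 3rd: the
# rest), then compute filters = (classes + 5) * <number of mask> for the
# [convolutional] layer before each [yolo]-layer.
anchors = [(12, 16), (19, 36), (40, 28), (36, 75), (76, 55),
           (72, 146), (142, 110), (192, 243), (459, 401)]  # placeholder values
classes = 3                                                # placeholder value

large = [i for i, (w, h) in enumerate(anchors) if w > 60 and h > 60]
medium = [i for i, (w, h) in enumerate(anchors)
          if i not in large and w > 30 and h > 30]
small = [i for i in range(len(anchors)) if i not in large and i not in medium]

for name, mask in [("1st", large), ("2nd", medium), ("3rd", small)]:
    print(f"{name} [yolo]-layer: mask={mask}, filters={(classes + 5) * len(mask)}")
```

Note that the groups need not come out as an even 3/3/3 split, which is exactly the case the quoted README warns about ("if many of the calculated anchors do not fit under the appropriate layers...").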
@AlexeyAB Thank you so much
@AlexeyAB
Sorry to bother you again.
```
./darknet detector calc_anchors coco.data -num_of_clusters 9 -width 512 -height 512 -show
```
This command creates cloud.png.
If it could create cloud.eps instead, it would be better for me.
How can I change cloud.png from PNG to EPS?
@Chaimmoon Could you share your code of CSPResNet50 with me? Thank you.
@WongKinYiu
I'm sorry to bother you again. I have another question about the SAM module.
The yolov4 paper modifies SAM from spatial-wise attention to point-wise attention, but I cannot fully understand this modification.
Does it mean that yolov4 modifies SAM from max-pooling and average-pooling to convolution layers?
What is point-wise attention? Is point-wise attention equal to a convolution layer?
- channel-wise: each channel has one attention value; the attention map is 1x1xc.
- spatial-wise: each position has one attention value; the attention map is wxhx1.
- point-wise: each feature point has one attention value; the attention map is wxhxc.
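To make the three shapes concrete, here is an illustrative PyTorch sketch of each kind of attention (my own illustration following these definitions, using SENet-style CAM and CBAM-style SAM as the channel-wise and spatial-wise examples; not code from this repo):

```python
import torch
import torch.nn as nn

class ChannelWiseAttn(nn.Module):
    # 1x1xc: one attention value per channel (SENet-style CAM).
    def __init__(self, c, r=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c, c // r, 1), nn.ReLU(inplace=True),
            nn.Conv2d(c // r, c, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)                     # attention map: N x C x 1 x 1

class SpatialWiseAttn(nn.Module):
    # wxhx1: one attention value per position (CBAM-style SAM).
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x):
        pooled = torch.cat([x.mean(1, keepdim=True),            # channel avg-pool
                            x.max(1, keepdim=True).values], 1)  # channel max-pool
        return x * torch.sigmoid(self.conv(pooled))  # attention map: N x 1 x H x W

class PointWiseAttn(nn.Module):
    # wxhxc: one attention value per feature point (the modified SAM in yolov4).
    def __init__(self, c):
        super().__init__()
        self.conv = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))       # attention map: N x C x H x W
```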
@WongKinYiu
Thanks for your reply.
1. What I understand about yolov4 modifying SAM from spatial-wise attention to point-wise attention is that yolov4 uses a 1x1 convolution layer to replace the max-pool, avg-pool, and 7x7 convolution layer, just like the following. Is my understanding correct?
2. If my understanding is correct, can you tell me why yolov4 modifies SAM from spatial-wise attention to point-wise attention? What are the benefits of this modification? Is it to reduce inference time? (https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-528140698)
These questions are very troubling to me. I look forward to your answers. Thanks a lot!
Hi,
In the ImageNet experiments, the paper says that it should be trained for 800 epochs:
However, in the code, it says that it should be trained for 80 epochs:
So there is a big difference...
Besides, I tried to re-implement it in PyTorch, and the accuracy is 7~8 points behind your method, even though the network architecture and number of parameters are the same as in your Darknet results...
Best, Mu