xh-liu / HydraPlus-Net


Question About Attentive Feature Network #5

Open ccJia opened 6 years ago

ccJia commented 6 years ago

Hi Liu, I want to make sure the structure of my AF-Net is correct. Could you help me check it? I modified the branch after the "ch_concat_3a_chconcat" layer, and I just use L = 4.

image

And this is my prototxt for Caffe.

layer { name: "ch_concat_3a_chconcat" type: "Concat" bottom: "conv_3a_1x1" bottom: "conv_3a_3x3" bottom: "conv_3a_double_3x3_1" bottom: "conv_3a_proj" top: "ch_concat_3a_chconcat" }

layer { name: "attention_conv_3b_1x1" type: "Convolution" bottom: "ch_concat_3a_chconcat" top: "attention_conv_3b_1x1" convolution_param { num_output: 4 kernel_size: 1 stride: 1 pad: 0 } }

layer { name: "slice_attention_conv_3b_1x1" type: "Slice" bottom: "attention_conv_3b_1x1" top: "slice_attention_conv_3b_1x1_0" top: "slice_attention_conv_3b_1x1_1" top: "slice_attention_conv_3b_1x1_2" top: "slice_attention_conv_3b_1x1_3"

slice_param { axis: 1 slice_point: 1 slice_point: 2 slice_point: 3 slice_point: 4 } }

layer { name: "attention_mul_feature_0" type: "Eltwise" bottom: "ch_concat_3a_chconcat" bottom: "slice_attention_conv_3b_1x1_0" top: "attention_mul_feature_0" eltwise_param { operation: PROD } } layer { name: "attention_mul_feature_1" type: "Eltwise" bottom: "ch_concat_3a_chconcat" bottom: "slice_attention_conv_3b_1x1_1" top: "attention_mul_feature_1" eltwise_param { operation: PROD } } layer { name: "attention_mul_feature_2" type: "Eltwise" bottom: "ch_concat_3a_chconcat" bottom: "slice_attention_conv_3b_1x1_2" top: "attention_mul_feature_2" eltwise_param { operation: PROD } } layer { name: "attention_mul_feature_3" type: "Eltwise" bottom: "ch_concat_3a_chconcat" bottom: "slice_attention_conv_3b_1x1_3" top: "attention_mul_feature_3" eltwise_param { operation: PROD } } layer { name: "attention_3a_chconcat" type: "Concat" bottom: "attention_mul_feature_0" bottom: "attention_mul_feature_1" bottom: "attention_mul_feature_2" bottom: "attention_mul_feature_3" top: "attention_3a_chconcat" } Thank you.

xh-liu commented 6 years ago

Hi, your basic structure is correct; however, there are some parts that differ from my implementation:

  1. The layers attention_mul_feature_0 to attention_mul_feature_3 should be the element-wise multiplication of ch_concat_3a_chconcat and slice_attention_conv_3b_1x1_tile, where slice_attention_conv_3b_1x1_tile is the attention slice tiled to have the same number of channels as ch_concat_3a_chconcat (see the sketch below). Otherwise the dimensions of ch_concat_3a_chconcat and slice_attention_conv_3b_1x1 do not match, and the element-wise product will produce an error.
  2. I did not concat attention_mul_feature_0 to attention_mul_feature_3 into attention_3a_chconcat and pass that through the following blocks; instead, I let each of attention_mul_feature_0 to attention_mul_feature_3 pass through the following blocks separately.

Hope this will help you!
Best, Xihui
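A minimal prototxt sketch of point 1 for a single attention slice, reusing the layer names from the question. Using a Tile layer is one way to do the channel broadcast in Caffe; the value 256 is only a placeholder assumption for the actual channel count of ch_concat_3a_chconcat.

layer {
  name: "slice_attention_conv_3b_1x1_0_tile"
  type: "Tile"
  bottom: "slice_attention_conv_3b_1x1_0"
  top: "slice_attention_conv_3b_1x1_0_tile"
  # replicate the single attention channel along the channel axis so its
  # shape matches ch_concat_3a_chconcat before the element-wise product
  tile_param { axis: 1 tiles: 256 }
}

layer {
  name: "attention_mul_feature_0"
  type: "Eltwise"
  bottom: "ch_concat_3a_chconcat"
  bottom: "slice_attention_conv_3b_1x1_0_tile"
  top: "attention_mul_feature_0"
  eltwise_param { operation: PROD }
}

Following point 2, attention_mul_feature_0 would then be fed into the subsequent blocks on its own (and likewise for slices 1 to 3), without first concatenating the four attended features into attention_3a_chconcat.
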
ccJia commented 6 years ago

Hi,

image

According to the snapshot above, the input "F" is [C, H, W] and the output attention map "a" is [L, H, W]; in this case L, as you suggested, is 8. My question is how to do the element-wise multiplication between "F" and "a". If I understand correctly, I take one slice of "a" and do the element-wise multiplication with each channel of "F". Or is "a" actually [L*C, H, W], so that we get L copies of the attention map, each of shape [C, H, W], and can then perform the element-wise multiplication directly?

ccJia commented 6 years ago

@xh-liu Could you help us? T-T

Li1991 commented 6 years ago

Hi, have you re-implemented this paper? Can you give a prototxt example? Thank you very much! @ccJia

ccJia commented 6 years ago

@Li1991 I haven't finished it. The AF-Net is confusing me... and I don't know how to implement it.

bilipa commented 6 years ago

@ccJia It may mean: for each channel of the attention map, use it to multiply with F. This can be seen from Fig. 4.

xh-liu commented 6 years ago

@ccJia Yes, your understanding is right. We take one slice of "a" and do the element-wise multiplication with each channel of "F".
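In prototxt terms, the shapes for one slice would work out as below (256 is again only a placeholder for the actual channel count of "F", and L = 8 follows the discussion above):

# "F"   (ch_concat_3a_chconcat):             N x 256 x H x W
# "a"   (the 1x1 attention convolution):     N x 8   x H x W
# one slice of "a":                          N x 1   x H x W
# slice tiled along axis 1 (tiles: 256):     N x 256 x H x W
# Eltwise PROD of the tiled slice and "F":   N x 256 x H x W  (every channel of F weighted by the same map)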

ccJia commented 6 years ago

@xh-liu Thank you ^-^ !

hezhenjun123 commented 6 years ago

@xh-liu Hi, it seems that @ccJia's prototxt has some other errors. I think the number of outputs from the element-wise multiplication of "a" with "F" is 24(38), and the total number of outputs to the GAP is 72(243). This means each channel of "a" needs to be element-wise multiplied with F1, F2 and F3, is that right?

hezhenjun123 commented 6 years ago

@xh-liu 24 (3x8) and 72 (24x3), sorry for the typos...

bilipa commented 6 years ago

@hezhenjun123 I think the GAP input is (24x3 + 1), which means (hydra, plus).
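Spelling that count out, under the assumption that the x3 refers to the three MDA branches of AF-Net and the "+1" is the global feature from the main network branch:

per MDA module: 8 attention channels x 3 feature blocks (F1, F2, F3) = 24 attended features
over the three AF-Net branches: 24 x 3 = 72
plus the main-branch global feature: 72 + 1 = 73 inputs to the GAP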

hezhenjun123 commented 6 years ago

@bilipa yeah, i think you are right!