zhang-pengyu / ADRNet

code and results for 'Learning Adaptive Attribute-Driven Representation for Real-Time RGB-T Tracking'
About CENet #12

Open easycodesniper-afk opened 2 years ago

easycodesniper-afk commented 2 years ago

Great work! I have a question about the CENet described in the paper. Why did you use summation plus five fc layers to handle the five residual features, rather than concatenation plus a single fc layer? The design seems quite novel but hard to understand. Looking forward to your reply!

zhang-pengyu commented 2 years ago

Thanks for your attention. The main question seems to be whether concat + 1 fc layer can replace the 5 fc layers. Theoretically, we aim to achieve attribute selection at the channel level: each fc layer generates its own weight vector, and a softmax is then applied across the attributes. A single layer seems able to achieve the same effect. In the implementation, however, with one layer the softmax across attributes cannot be applied, and the multiplied input dimension of the concatenated features makes it slower.
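
To make this concrete, here is a minimal PyTorch sketch of a summation + per-attribute-fc + softmax selection of this kind. The module and names (`AttributeSelection`, `num_attrs`) are illustrative, not taken from the ADRNet code:

```python
import torch
import torch.nn as nn

class AttributeSelection(nn.Module):
    """Sum the branch features, then let one fc per attribute produce
    channel-wise logits; a softmax across attributes yields the weights."""
    def __init__(self, channels=512, num_attrs=5):
        super().__init__()
        # One fc layer per attribute branch, so each branch gets its own weight.
        self.fcs = nn.ModuleList(
            nn.Linear(channels, channels) for _ in range(num_attrs)
        )
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, feats):                         # list of [B, C, H, W]
        fused = torch.stack(feats, dim=0).sum(dim=0)  # summation fusion
        s = self.pool(fused).flatten(1)               # [B, C] channel descriptor
        logits = torch.stack([fc(s) for fc in self.fcs], dim=1)  # [B, A, C]
        weights = logits.softmax(dim=1)               # attributes compete per channel
        stacked = torch.stack(feats, dim=1)           # [B, A, C, H, W]
        return (stacked * weights[..., None, None]).sum(dim=1)   # [B, C, H, W]
```

With concat + one fc, the pooled descriptor is a single [B, 5C] vector: there is no attribute axis left to apply the softmax over, and the fc input dimension is five times larger.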

easycodesniper-afk commented 2 years ago

Thanks for your prompt reply! I understand the computational cost and why you used five fc layers. But what if we tried a sigmoid function, as in Squeeze-and-Excitation Networks, with a single fc layer? Have you compared the two functions? Forgive my limited knowledge... Looking forward to your reply.

zhang-pengyu commented 2 years ago

Sorry for the late response. I don't quite understand how a sigmoid and one fc layer would act as SENet. Could you provide a more detailed description?

easycodesniper-afk commented 2 years ago

Sorry it took me so long to see this. What I meant is that I have seen, in other papers, the RGB and TIR features concatenated and then fed into an SENet to learn a weight. That is logically easier for me to understand, because RGB and TIR are independent at the channel level. ADRNet instead sums the features of the several branches, so every channel mixes information from the four challenge branches and the one general branch; after average pooling, five fully connected layers learn the attribute weight corresponding to each challenge, and finally a softmax performs the probability estimation? I am not sure whether my description is accurate... What I find hard to understand is why features fused by summation can still yield the weight corresponding to each challenge branch's output. Is this something the network learns during training? Looking forward to your reply!
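
For contrast, here is a sketch of the concat + SENet alternative described above, with hypothetical names; the key point is that the sigmoid gates every channel independently rather than making the branches compete:

```python
import torch
import torch.nn as nn

class ConcatSEGate(nn.Module):
    """Concatenate branch features and gate channels SENet-style."""
    def __init__(self, channels=512, num_branches=5, reduction=16):
        super().__init__()
        c = channels * num_branches
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(c, c // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(c // reduction, c),
            nn.Sigmoid(),                        # independent per-channel gates
        )

    def forward(self, feats):                    # list of [B, C, H, W]
        x = torch.cat(feats, dim=1)              # [B, 5C, H, W]
        g = self.fc(self.pool(x).flatten(1))     # [B, 5C] gates in (0, 1)
        return x * g[:, :, None, None]           # gated concatenated features
```

Note this also changes the output width to 5C, whereas summation fusion keeps it at C.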

zhang-pengyu commented 2 years ago

Yes, we want the fully connected layer in each selection branch to learn the weight for its own branch. This may differ somewhat from the SENet operation; SKNet [Selective Kernel Networks, CVPR 2019] may be a better reference for understanding it.
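
A toy check of the distinction the SKNet pointer makes, using only standard PyTorch: softmax weights sum to 1 across branches for each channel (the branches compete), while sigmoid gates are independent:

```python
import torch

logits = torch.randn(5, 4)        # 5 attribute branches x 4 channels
soft = logits.softmax(dim=0)      # SKNet-style: branches compete per channel
sig = logits.sigmoid()            # SENet-style: independent gates
print(soft.sum(dim=0))            # tensor([1., 1., 1., 1.])
print(sig.sum(dim=0))             # generally != 1
```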

easycodesniper-afk commented 2 years ago

I understand now; I had only read the email and forgot to reply on GitHub. Thanks for the pointers, Pengyu.

aibc-hp commented 2 years ago

When I train only the conv4_EI layer parameters, the reported mean precision and inter loss oscillate around a fixed value and do not converge; changing the learning rate from 0.00002 to 0.0001 made no difference. Have you run into this problem?

RenjunL-xdu commented 1 year ago

I ran into this problem too. My batch size was 64; if I set the batch size smaller, e.g. 4, convergence is faster.