ocrhei closed this issue 2 years ago
In the original paper, there are indeed three 3x3 convolution layers in each head. However, according to the author of FCENet, using a single 3x3 convolution layer in each head gives almost the same accuracy. So we use just one here.
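To give a rough sense of how much smaller a head becomes with one 3x3 convolution instead of three, here is a minimal parameter-count sketch. The channel counts (256 in, 4 out), the assumption that intermediate layers keep the input width, and the helper names `conv3x3_params`, `head_layers`, and `head_params` are all hypothetical and chosen for illustration; they are not taken from the FCENet code or paper.

```python
def conv3x3_params(in_channels, out_channels, bias=True):
    """Number of learnable parameters in one 3x3 convolution layer."""
    weights = 3 * 3 * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def head_layers(in_channels, out_channels, num_convs):
    """Channel pairs (in, out) for a stack of 3x3 convs.

    Assumption: every layer except the last keeps `in_channels` channels,
    and only the last layer maps to `out_channels`.
    """
    layers = []
    for i in range(num_convs):
        is_last = i == num_convs - 1
        layers.append((in_channels, out_channels if is_last else in_channels))
    return layers


def head_params(in_channels, out_channels, num_convs):
    """Total parameter count of a head built from `num_convs` 3x3 convs."""
    return sum(conv3x3_params(cin, cout)
               for cin, cout in head_layers(in_channels, out_channels, num_convs))


if __name__ == "__main__":
    # Hypothetical sizes, not FCENet's actual configuration.
    in_channels, out_channels = 256, 4
    for num_convs in (1, 3):
        total = head_params(in_channels, out_channels, num_convs)
        print(f"{num_convs} conv(s): {total} parameters")
```

Under these assumed sizes the single-layer head is far smaller than the three-layer one, which fits the choice to keep one layer when accuracy is reported to be nearly unchanged.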
Thank you, that was a very prompt reply.
Why does each part of the head in the code have only one 3x3 convolution, when the paper says there are three?