Open aye0804 opened 4 years ago
The paper uses average pooling to aggregate the β × C features in one node into 1 × C, but the code uses max pooling: maxpool = nn.MaxPool2d(kernel_size=(patch_w, patch_h), stride=(patch_w, patch_h), padding=0, ceil_mode=True)
After publishing the paper, we found that max pooling can achieve better performance. We will update the arXiv version of the paper to match the code.
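For anyone who wants to compare the two aggregation choices locally, here is a minimal sketch. It assumes a small 4 × 4 toy tensor and patch_w = patch_h = 2, both picked only for illustration and not taken from the repository. The AvgPool2d layer is my guess at what the paper's average-pooling variant would look like with the same arguments; it is not from the authors' code.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; the repository computes its own patch sizes.
patch_w, patch_h = 2, 2

# Toy feature map of shape (batch=1, channels=1, height=4, width=4).
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# Max pooling, with the same arguments as the line quoted from the code.
maxpool = nn.MaxPool2d(kernel_size=(patch_w, patch_h),
                       stride=(patch_w, patch_h), padding=0, ceil_mode=True)

# Assumed average-pooling counterpart (what the paper text describes).
avgpool = nn.AvgPool2d(kernel_size=(patch_w, patch_h),
                       stride=(patch_w, patch_h), padding=0, ceil_mode=True)

max_out = maxpool(x)  # keeps the largest value in each patch
avg_out = avgpool(x)  # keeps the mean of each patch

print(max_out.squeeze())
print(avg_out.squeeze())
```

Both layers reduce each patch to a single value per channel, so the output shape is the same either way; only the values passed on to the node differ.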