Closed Emilycs09 closed 4 years ago
hi, thank you for the reminder. This is not a problem: if you set the input size to an even number, these two sizes will be the same. However, if you set the crop size to an odd number (like 769 x 769), you need to modify this code.
I understand what you mean, but in the val or test process the input images are not cropped, so they can have different sizes, both even and odd. In that case the problem occurs.
To avoid this, I changed your code slightly:

original: `self.max_pool = nn.MaxPool2d(2, stride=2)`
edited: `self.max_pool = nn.MaxPool2d(3, stride=2, padding=1)`
By changing the kernel size of the max-pool layer and adding padding, the outputs of the max pool and the conv will always have the same size. But this changes the network structure, and I'm not sure whether it will affect the performance or not.
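A quick way to check the proposed change without running the network is to evaluate the output-size formula from the PyTorch docs for both layers. This is a minimal sketch using plain Python arithmetic (no torch required); `out_size` is a hypothetical helper implementing that formula:

```python
import math

def out_size(h, kernel, stride, padding=0):
    # PyTorch output-size formula for Conv2d/MaxPool2d (dilation = 1):
    # floor((h + 2*padding - kernel) / stride) + 1
    return math.floor((h + 2 * padding - kernel) / stride) + 1

for h in (768, 769):  # one even and one odd spatial size
    conv = out_size(h, kernel=3, stride=2, padding=1)      # conv3x3(..., stride=2, padding=1)
    pool_old = out_size(h, kernel=2, stride=2)             # nn.MaxPool2d(2, stride=2)
    pool_new = out_size(h, kernel=3, stride=2, padding=1)  # nn.MaxPool2d(3, stride=2, padding=1)
    print(h, conv, pool_old, pool_new)
# 768: all three give 384; 769: conv and pool_new give 385, pool_old gives 384
```

With the edited pool the kernel/stride/padding match the conv exactly, so the two branches agree for every input size, odd or even.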
Thank you for your suggestion; you can test the performance after the modification. The downsampling block follows the ENet paper, so the pooling kernel size is the same as in ENet, 2x2.
hello,
Your work has been very helpful!
I found a problem in DABNet.py at line 98: output = torch.cat([output, max_pool], 1). As shown, this line concatenates the outputs of conv3x3(nIn, nConv, kSize=3, stride=2, padding=1) and MaxPool2d(2, stride=2), but these two outputs can have different dimensions in [h, w].
According to the PyTorch docs, the output size of both Conv2d and MaxPool2d (with dilation 1) is floor((h + 2*padding - kernel_size) / stride) + 1.

So the output of the conv is floor((h - 1) / 2) + 1, while the output of the pool is floor((h - 2) / 2) + 1. For odd h these differ by one, and this causes a breakdown.
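The mismatch above can be reproduced with the formula alone, no model needed. This is a minimal sketch (the helper `out_size` is illustrative, not part of DABNet) using an odd validation-style input of 769:

```python
import math

def out_size(h, kernel, stride, padding=0):
    # PyTorch output-size formula for Conv2d/MaxPool2d (dilation = 1):
    # floor((h + 2*padding - kernel) / stride) + 1
    return math.floor((h + 2 * padding - kernel) / stride) + 1

h = 769  # odd spatial size, possible in val/test where images are not cropped
conv_h = out_size(h, kernel=3, stride=2, padding=1)  # conv3x3(..., stride=2, padding=1)
pool_h = out_size(h, kernel=2, stride=2)             # nn.MaxPool2d(2, stride=2)
print(conv_h, pool_h)  # 385 vs 384 -> torch.cat along dim 1 fails on mismatched h/w
```

For even h the two expressions coincide, which is why the bug only surfaces on odd-sized inputs.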