kiddyboots216 / CommEfficient

PyTorch for benchmarking communication-efficient distributed SGD optimization algorithms

Resnet9 Pooling #5

Closed howard-yen closed 3 years ago

howard-yen commented 3 years ago

Hi authors,

While I was training on CIFAR10 using Resnet9 (from models/resnet9.py) with default settings (i.e. the channel sizes), I got an error on the last pooling layer out = self.pool(out).view(out.size()[0], -1) in BasicNet.forward.

After printing out the tensor size of the input into the pooling layer, I found that the input was of size [batch_size, 512, 3, 3]. I fixed this error by changing the last pooling layer to self.pool = nn.MaxPool2d(2), which made sense since the last linear layer expects a tensor of size [batch_size, 512].

Is this an error that has happened before or could there have been something else that went wrong? I see that the last pooling layer was hardcoded to be nn.MaxPool2d(4), so was the expected input of size [batch_size, 512, 5, 5] and I did something wrong?

Thanks, Howard

kiddyboots216 commented 3 years ago

I've never seen this error before. Can you share the command that you're running?

howard-yen commented 3 years ago

Thank you for the reply! It appears the error was on my end: I had cropped the dataset images to 24 x 24 instead of 32 x 32, so there wasn't anything wrong with the repo's implementation. Thanks again!
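
For anyone who hits the same error, here is a minimal sketch of the shape arithmetic behind it. It is not the repo's code: it assumes the network halves the spatial size three times before the final pooling layer (inferred from the 24 x 24 input arriving as 3 x 3), and the helper names pool_out and final_pool_input are made up for illustration. The formula is the one documented for nn.MaxPool2d with dilation 1 and ceil_mode=False.

```python
from typing import Optional


def pool_out(size: int, kernel: int, stride: Optional[int] = None, padding: int = 0) -> int:
    """Output length along one spatial dimension of a max-pool layer.

    Mirrors the documented nn.MaxPool2d formula for dilation=1 and
    ceil_mode=False; stride defaults to the kernel size, as in PyTorch.
    """
    stride = kernel if stride is None else stride
    return (size + 2 * padding - kernel) // stride + 1


def final_pool_input(image_size: int, num_halvings: int = 3) -> int:
    """Spatial size reaching the last pooling layer.

    ASSUMPTION: three 2x downsampling steps, inferred from the thread
    (24 -> 3), not read from models/resnet9.py.
    """
    size = image_size
    for _ in range(num_halvings):
        size = pool_out(size, kernel=2)
    return size


full = final_pool_input(32)     # uncropped CIFAR10 images
cropped = final_pool_input(24)  # images cropped to 24 x 24

print(full, pool_out(full, kernel=4))        # 4x4 map, MaxPool2d(4) gives 1x1
print(cropped, pool_out(cropped, kernel=4))  # 3x3 map, MaxPool2d(4) gives size 0
print(cropped, pool_out(cropped, kernel=2))  # 3x3 map, MaxPool2d(2) gives 1x1
```

A computed output size of 0 is what makes the original MaxPool2d(4) fail on the cropped input. With 32 x 32 images the final map is 4 x 4, so MaxPool2d(4) reduces it to 1 x 1 and the flattened tensor has the 512 features the linear layer expects. Switching to MaxPool2d(2) only works around the crop; restoring the 32 x 32 input keeps the model as the authors wrote it.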