azamatkhid / mnasnet-pytorch

Pytorch implementation of MnasNet-A1 & MnasNet-B1

Code Review #1

Closed flrngel closed 4 years ago

flrngel commented 4 years ago

1) Order of conv, ReLU, batch norm

https://github.com/azamatkhid/mnasnet-pytorch/blob/f855edc8d114636f88ef586fa257a7f5d1f52b9b/model.py#L30-L32

Conv -> ReLU -> Batch Normalization

but the reference uses ReLU after batch norm.
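For illustration, a minimal sketch of the two orderings in plain PyTorch (channel sizes are placeholders, not taken from the repo):

```python
import torch.nn as nn

# Conv -> ReLU -> BN ordering
conv_relu_bn = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False),
    nn.ReLU(inplace=True),
    nn.BatchNorm2d(64),
)

# Conv -> BN -> ReLU ordering (the one used by the torchvision reference)
conv_bn_relu = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)
```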

2) I think the difference in the number of parameters is because of this:

Your code

https://github.com/azamatkhid/mnasnet-pytorch/blob/f855edc8d114636f88ef586fa257a7f5d1f52b9b/model.py#L118-L122

TorchVision code

https://github.com/pytorch/vision/blob/f6a3e0c3ca41125461ea2484936367d40d0ad34f/torchvision/models/mnasnet.py#L80

3) The expansion factors are also different

https://github.com/pytorch/vision/blob/f6a3e0c3ca41125461ea2484936367d40d0ad34f/torchvision/models/mnasnet.py#L116-L122

But you are right if we check the paper.

4) The Squeeze-and-Excitation module in your code doesn't include the residual part.

Your code: https://github.com/azamatkhid/mnasnet-pytorch/blob/f855edc8d114636f88ef586fa257a7f5d1f52b9b/model.py#L43-L48


Questions

Is there a particular reason for choosing adaptive average pooling?

Magauiya commented 4 years ago

Thank you for sharing your code! It would be nice to:

1. Squeeze the following lines:

https://github.com/azamatkhid/mnasnet-pytorch/blob/f855edc8d114636f88ef586fa257a7f5d1f52b9b/model.py#L103-L120

by adding a function (see the sketch after this list): https://github.com/Magauiya/NTIRE20-denoising/blob/13b25fc7901c21545407e9b3065f6e44ab44213b/models/EBRN.py#L71-L76

2. Remove the block if it is not used: https://github.com/azamatkhid/mnasnet-pytorch/blob/f855edc8d114636f88ef586fa257a7f5d1f52b9b/model.py#L5-L22

3. Write comments in the code.

4. Add a brief explanation on how to use your code and the network that you have implemented.

5. Add results on open datasets (e.g. CIFAR100, STL-10, etc.)
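For point 1, a hypothetical helper in the spirit of the linked function (the name `conv_bn_act` and its arguments are illustrative, not taken from either repo):

```python
import torch.nn as nn

def conv_bn_act(in_ch, out_ch, kernel_size=3, stride=1, groups=1, act=True):
    """Conv2d -> BatchNorm2d (-> ReLU) stack, to avoid repeating the same three lines."""
    layers = [
        nn.Conv2d(in_ch, out_ch, kernel_size, stride=stride,
                  padding=kernel_size // 2, groups=groups, bias=False),
        nn.BatchNorm2d(out_ch),
    ]
    if act:
        layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)

# e.g. the stem of the network becomes a one-liner:
stem = conv_bn_act(3, 32, kernel_size=3, stride=2)
```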

azamatkhid commented 4 years ago

@flrngel Thanks for the comments. My replies follow:

1) As you already mentioned, I followed the order given in the paper: convolution -> BN -> ReLU.

2) Regarding the comparison with the official MnasNet implementation in torchvision, as you can see it is MnasNet-B1:

https://github.com/pytorch/vision/blob/f6a3e0c3ca41125461ea2484936367d40d0ad34f/torchvision/models/mnasnet.py#L84-L86

I already compared my MnasNet-B1 implementation with the official one; there is no difference in architecture structure or number of parameters.

The issue I mentioned is about MnasNet-A1: the paper reports 3.9M parameters, but my implementation results in about 3.8M.
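For reference, parameter counts can be compared directly (a minimal sketch; the import of this repo's model class is assumed and left commented out, since its exact name is not shown here):

```python
from torchvision import models

def count_params(model):
    """Total number of trainable and non-trainable parameters."""
    return sum(p.numel() for p in model.parameters())

print(count_params(models.mnasnet1_0()))  # torchvision's MnasNet-B1 (depth multiplier 1.0)
# from model import MnasNet               # this repo's MnasNet-A1 (class name assumed)
# print(count_params(MnasNet()))          # ~3.8M here vs. 3.9M reported in the paper
```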

3) For MnasNet-A1 the expansion coefficients are taken from the paper; for MnasNet-B1 they are the same as in the official torchvision implementation.

4) Squeeze-and-excitation:

[figure: MnasNet paper, Figure 7]

As you can see from the figure above (Figure 7 (b)), there is no skip connection inside the SE block itself. You referred to the original paper that introduces the SE block:

[figure: original Squeeze-and-Excitation paper]

In the case of the inverted MobileNet bottleneck, the SE usage is closer to the Inception-style module than to the residual one, and there is actually a residual connection, as shown in Figure 7 (b).
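A rough sketch of that arrangement (illustrative only, not copied from either implementation): the SE module itself only rescales channels, and the identity shortcut wraps the whole inverted bottleneck.

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze (global pool) -> excite (two 1x1 convs) -> scale; no skip connection inside."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc1 = nn.Conv2d(channels, channels // reduction, kernel_size=1)
        self.fc2 = nn.Conv2d(channels // reduction, channels, kernel_size=1)

    def forward(self, x):
        s = x.mean(dim=(2, 3), keepdim=True)                   # squeeze
        s = torch.sigmoid(self.fc2(torch.relu(self.fc1(s))))   # excite
        return x * s                                           # scale only

# Inside the inverted bottleneck (expand -> depthwise -> SE -> project),
# the residual connection wraps the whole block:
#   out = project(se(depthwise(expand(x))))
#   if stride == 1 and in_channels == out_channels:
#       out = out + x
```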

5) There is no clear reason for the adaptive average pooling choice actually; I will think about it.

azamatkhid commented 4 years ago

@Magauiya Thanks for the comments. Here are my replies:

  1. I will take your advice into account to make the code more compact (to do).

  2. Done

  3. Good point, I will add comments in the code to make it more readable (to do).

  4. I will add a README file with a full explanation and visualizations (to do).

  5. These are the next steps: to train, validate, and test on the datasets you mentioned in the comment.