rauldiaz opened this issue 6 years ago
It took me a while to understand that as well, since it is not directly explained in the paper. A 1x1 convolution is effectively a fully connected layer applied to every point independently, using the same weights and biases at each position: that is how the weights are shared across all points through one joint weight matrix. If you have, say, 32 feature maps at the end, you have effectively created 32 of those dense 1x1 conv neurons, and by stacking such 1x1 conv layers you get something that behaves like an MLP applied point-wise.
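A minimal numpy sketch of the equivalence described above (the shapes and variable names here are illustrative, not taken from the PointNet code): applying a convolution with spatial kernel size 1 to N points is the same as applying one shared linear layer to each point separately.

```python
import numpy as np

# Hypothetical setup: N points, each with C_in input features.
rng = np.random.default_rng(0)
N, C_in, C_out = 5, 3, 4
points = rng.standard_normal((N, C_in))   # per-point feature vectors
W = rng.standard_normal((C_out, C_in))    # the single shared weight matrix
b = rng.standard_normal(C_out)            # the single shared bias

# "1x1 convolution" over the points: the kernel covers one position at a
# time, so each output position sees only its own point, and every
# position uses the SAME W and b.
conv_out = np.stack([W @ points[i] + b for i in range(N)])

# Per-point fully connected layer: apply the same linear map to each point.
mlp_out = points @ W.T + b

# Both paths produce identical outputs of shape (N, C_out).
print(np.allclose(conv_out, mlp_out))
```

Stacking several such layers (with nonlinearities in between) gives exactly the "shared MLP" from the paper: one set of weights, applied to every point.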
For better intuition, you can also read the Network in Network paper: https://arxiv.org/abs/1312.4400
Can you briefly elaborate on how this paper (NIN) is related to the concept of a shared MLP?
Hi,
Your paper indicates that the MLP layers are shared. However, I can't seem to understand where you are doing that in the code. Do they share weights and biases? How? They look to me like a regular stack of conv2d operations. Perhaps I'm reading it wrong.
Thanks, Raul