Dear authors,
In your paper, you describe the regressor $g(\cdot): \mathbb{R}^{N} \rightarrow [-1, 1]$ as a multi-layer perceptron composed of three fully connected layers. See the images below for reference.

However, in this repository, the regressor is implemented using convolutions with kernel size 1:
https://github.com/nhshin-mcl/MWR/blob/d070db454a6e7ae93426a9f220464f39f9187445/code/Network.py#L18-L27
https://github.com/nhshin-mcl/MWR/blob/d070db454a6e7ae93426a9f220464f39f9187445/code/Network.py#L48

Fully connected layers are usually implemented with a linear layer. Do you perform some slicing/viewing of the tensors that makes this implementation equivalent to a linear layer, or are the convolutions used here by mistake? Could you please elaborate on this?

I also suppose that the input of the regressor has shape [B, 1536, 1, 1], where B is the mini-batch size and the spatial dimensions are 1 × 1. Is this correct?
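For context, here is a minimal sketch of the equivalence I have in mind, assuming the regressor input really is of shape [B, 1536, 1, 1] (the shapes and layer sizes below are my assumptions, not taken from your code): a `Conv2d` with kernel size 1 applied to a 1 × 1 spatial map computes the same function as a `Linear` layer with the same weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, C_in, C_out = 4, 1536, 512  # assumed batch size and channel counts

conv = nn.Conv2d(C_in, C_out, kernel_size=1)
linear = nn.Linear(C_in, C_out)

# Copy the conv weights into the linear layer so both compute the same map.
# Conv weight has shape (C_out, C_in, 1, 1); the linear weight is (C_out, C_in).
with torch.no_grad():
    linear.weight.copy_(conv.weight.view(C_out, C_in))
    linear.bias.copy_(conv.bias)

x = torch.randn(B, C_in, 1, 1)
out_conv = conv(x).view(B, C_out)      # conv output is (B, C_out, 1, 1)
out_linear = linear(x.view(B, C_in))   # flatten the 1x1 spatial dims first

print(torch.allclose(out_conv, out_linear, atol=1e-5))
```

So if the feature map entering the regressor is always 1 × 1, the convolutional implementation would be functionally equivalent to the fully connected layers described in the paper; I mainly want to confirm that this is the intended reading.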
Jakub