Open sainisanjay opened 5 years ago
A 2-d convolution operates on a 3-d tensor (e.g. a color image input with a certain width and height has a depth of 3 for the 3 color channels). A 3-d convolution likewise expects a 4-d tensor; given fewer dimensions it fails, the same way a 2-d convolution would if it only got a 2-d tensor.
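To make the rank rule above concrete, here is a minimal sketch in plain Python that only checks shapes. The helper `check_conv_input` is hypothetical (it is not part of any framework), the shapes are illustrative, and the layout is assumed to be channels-last with no batch axis. Real frameworks usually add a batch dimension on top of this.

```python
def check_conv_input(shape, spatial_dims):
    """Split a channels-last shape into (spatial, channels).

    Hypothetical helper: an N-d convolution needs N spatial axes
    plus one channel axis, so the input rank must be N + 1.
    """
    expected_rank = spatial_dims + 1
    if len(shape) != expected_rank:
        raise ValueError(
            f"{spatial_dims}-d convolution expects a {expected_rank}-d tensor, "
            f"got a {len(shape)}-d tensor with shape {shape}"
        )
    return tuple(shape[:-1]), shape[-1]


# Color image: height x width x 3 color channels -> valid for a 2-d convolution.
image_shape = (224, 224, 3)
image_spatial, image_channels = check_conv_input(image_shape, spatial_dims=2)

# Volume: depth x height x width x channels -> valid for a 3-d convolution.
volume_shape = (10, 400, 352, 128)
volume_spatial, volume_channels = check_conv_input(volume_shape, spatial_dims=3)

# A 3-d tensor passed to a 3-d convolution is rejected, just like
# a 2-d tensor passed to a 2-d convolution.
try:
    check_conv_input(image_shape, spatial_dims=3)
    rank_error = None
except ValueError as exc:
    rank_error = str(exc)
```

The point of the sketch is only that the channel axis always counts toward the rank, so the convolution's dimensionality is one less than the rank of the tensor it consumes.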
1) After the VFE layers we get a 4D feature map. Then how is it reshaped to 3D? @lengly @ring00 @abhigoku10 @jeasinema