Open lychahaha opened 6 years ago
Yes. For an ORConv operation, the size of inputs is [nBatch x nInputChannel x nOrientation x H x W], the size of ARFs is [nOutputChannel x nInputChannel x nOrientation x kH x kW], and the size of outputs is [nBatch x nOutputChannel x nRotation x H x W].
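As a rough sanity check, the three sizes above can be written out as plain tuples. This is only shape bookkeeping, not the ORN implementation: the helper name `orconv_shapes` and its argument names are made up for illustration, and it assumes the spatial size H x W is preserved, as in the sizes quoted above.

```python
def orconv_shapes(n_batch, n_in, n_out, n_orientation, n_rotation, h, w, k_h, k_w):
    """Return the ORConv tensor sizes quoted in this thread as tuples.

    Hypothetical helper: it performs no convolution and assumes the
    output keeps the input's spatial size (H x W).
    """
    return {
        # [nBatch x nInputChannel x nOrientation x H x W]
        "inputs": (n_batch, n_in, n_orientation, h, w),
        # [nOutputChannel x nInputChannel x nOrientation x kH x kW]
        "arfs": (n_out, n_in, n_orientation, k_h, k_w),
        # [nBatch x nOutputChannel x nRotation x H x W]
        "outputs": (n_batch, n_out, n_rotation, h, w),
    }


shapes = orconv_shapes(
    n_batch=2, n_in=10, n_out=20,
    n_orientation=8, n_rotation=8,
    h=28, w=28, k_h=3, k_w=3,
)
for name, shape in shapes.items():
    print(name, shape)
```

Note that the inputs and the ARFs share the nInputChannel and nOrientation dimensions, while nRotation only appears in the outputs.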
Okay, thanks. By the way, something in the paper confuses me.
@ljhandlwt ,
In ORN, feature maps and filters (ARFs) are vector fields that explicitly encode orientation information. Coordinate Rotation and Orientation Spin are steps in rotating a vector field. Here is a simple illustration:
For more details, please check Sec 3.1 in the paper.
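The two steps can be sketched for the simplest case, a 90-degree rotation of a square vector field with N = 4 orientation channels. This is a toy illustration under stated assumptions, not the code from the paper or the repository: the function name `rotate_field_90` is hypothetical, the rotation direction (counter-clockwise) and the shift direction of the channels are chosen arbitrarily, and rotations that are not multiples of 90 degrees (which need interpolation) are not covered.

```python
def rotate_field_90(field, steps=1):
    """Rotate a square vector field by steps * 90 degrees.

    `field[y][x]` is a list of 4 values, one per orientation channel.
    Assumed conventions: counter-clockwise grid rotation, and channel c
    moving to channel (c + 1) % 4 for each 90-degree step.
    """
    for _ in range(steps % 4):
        size = len(field)
        # Coordinate rotation: move every grid point to its rotated position.
        moved = [[field[j][size - 1 - i] for j in range(size)] for i in range(size)]
        # Orientation spin: cyclically shift the channels at every point.
        field = [[point[-1:] + point[:-1] for point in row] for row in moved]
    return field


# A 2x2 field with a single arrow at the top-left point, in channel 0.
field = [
    [[1, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0]],
]
rotated = rotate_field_90(field)
print(rotated[1][0])
```

Skipping the second step would move the arrow to its new grid position but leave it pointing in its old direction, which is why both steps are needed to rotate a vector field.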
@ZhouYanzhao ,
According to Sec 3.1 and Fig. 2, an ARF has a shape of [W,W,N].
So an arrow in the ARF in Fig. 2, or in your picture, represents a number, is that right?
@ljhandlwt ARFs are viewed as N-directional points on a grid (vector fields). For each arrow, the length represents its value (a number), and the angle indicates the corresponding orientation channel.
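To make the arrow picture concrete, here is a small sketch that turns the N values stored at one grid point into (angle, length) pairs. The helper name `point_to_arrows` is hypothetical, and the convention that channel k points at angle 2πk/N is an assumption made for illustration.

```python
import math


def point_to_arrows(values):
    """Map the N values at one grid point to (angle_in_radians, length) pairs.

    Assumption: orientation channel k corresponds to the angle 2*pi*k/N.
    """
    n = len(values)
    return [(2 * math.pi * k / n, value) for k, value in enumerate(values)]


# One grid point of an ARF with N = 4 orientation channels.
arrows = point_to_arrows([0.5, 0.0, 1.0, 2.0])
for angle, length in arrows:
    print(f"angle={math.degrees(angle):.0f} deg, length={length}")
```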
@ZhouYanzhao Oh, I think I finally understand what you mean. "N-directional points" refers to the activation value at (x,y), doesn't it? The activation value is a scalar for canonical conv filters, but now it's an N-dimensional vector, am I right?
According to my understanding, nOrientation is the dimension of each point (unit) in the feature map or kernel, and nRotation is the number of rotated copies of the kernel. Points in the feature map or kernel are not always scalars; they can be vectors, or N-dimensional points as your paper puts it.
So ORConv2d(1,10,arf_config=(1,8), kernel_size=3) means the input has 1 channel whose points are scalars, and the conv kernel has 1 input channel, 10 output channels, and 8 rotated copies, with points that are also scalars. ORConv2d(10,20,arf_config=8, kernel_size=3) means the input has 10 channels whose points are 8-dim vectors, and the conv kernel has 10 input channels, 20 output channels, and 8 rotated copies, with points that are also 8-dim vectors.
In short, nOrientation is the dimension of the points in the input, while nRotation is the dimension of the points in the output.
Is that right?
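The reading proposed in this comment can be spelled out for the two example layers. The sketch below does not import or call the ORN library; `parse_arf_config` and `describe_orconv` are hypothetical helpers, and treating a single integer `arf_config=8` as `(8, 8)` is an assumption that matches the interpretation given in the comment.

```python
def parse_arf_config(arf_config):
    """Return (n_orientation, n_rotation).

    Assumption: a single int stands for the same value in both slots.
    """
    if isinstance(arf_config, int):
        return arf_config, arf_config
    n_orientation, n_rotation = arf_config
    return n_orientation, n_rotation


def describe_orconv(in_channels, out_channels, arf_config, kernel_size):
    """Summarize point dimensions and ARF size under the reading in this comment."""
    n_orientation, n_rotation = parse_arf_config(arf_config)
    return {
        "input_point_dim": n_orientation,   # dimension of each input point
        "arf_shape": (out_channels, in_channels, n_orientation, kernel_size, kernel_size),
        "output_point_dim": n_rotation,     # dimension of each output point
    }


first = describe_orconv(1, 10, arf_config=(1, 8), kernel_size=3)
second = describe_orconv(10, 20, arf_config=8, kernel_size=3)
print(first)
print(second)
```

Under this reading, the 8-dim output points of the first layer line up with the 8-dim input points expected by the second layer.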