ZhouYanzhao / ORN

Oriented Response Networks, in CVPR 2017
http://yzhou.work/ORN
BSD 3-Clause "New" or "Revised" License

the meaning of nOrientation and nRotation #10

Open lychahaha opened 6 years ago

lychahaha commented 6 years ago

As I understand it, nOrientation is the dimension of each point (unit) in the feature map or kernel, and nRotation is the number of rotated copies of the kernel. Points in the feature map or kernel are not always scalars; they can be vectors, or N-dimensional points as your paper says.

So ORConv2d(1,10,arf_config=(1,8), kernel_size=3) means the input has 1 channel whose points are scalars, and the conv kernel has 1 input channel, 10 output channels, and 8 rotated copies, whose points are also scalars. ORConv2d(10,20,arf_config=8, kernel_size=3) means the input has 10 channels whose points are 8-dim vectors, and the conv kernel has 10 input channels, 20 output channels, and 8 rotated copies, whose points are also 8-dim vectors.

In a word, nOrientation is the dimension of the points in the input, while nRotation is the dimension of the points in the output.

Is that right?

ZhouYanzhao commented 6 years ago

Yes. For an ORConv operation, the size of inputs is [nBatch x nInputChannel x nOrientation x H x W], the size of ARFs is [nOutputChannel x nInputChannel x nOrientation x kH x kW], and the size of outputs is [nBatch x nOutputChannel x nRotation x H x W].
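The shapes above can be checked with a small sketch. This is plain Python and does not call the ORN library; `parse_arf_config` and `orconv_shapes` are hypothetical helper names. It assumes that a bare integer `arf_config=n` is read as `(n, n)`, as the question above suggests, and that the output keeps the input's H x W, as the sizes in the reply state.

```python
from typing import Dict, Tuple, Union

ArfConfig = Union[int, Tuple[int, int]]


def parse_arf_config(arf_config: ArfConfig) -> Tuple[int, int]:
    """Return (nOrientation, nRotation).

    Assumption: a bare int n means (n, n), matching the reading in the thread.
    """
    if isinstance(arf_config, int):
        return arf_config, arf_config
    n_orientation, n_rotation = arf_config
    return n_orientation, n_rotation


def orconv_shapes(n_batch: int, in_ch: int, out_ch: int, arf_config: ArfConfig,
                  kernel_size: int, height: int, width: int) -> Dict[str, tuple]:
    """Shapes of the input, ARFs, and output of one ORConv, as listed above."""
    n_orientation, n_rotation = parse_arf_config(arf_config)
    return {
        "input": (n_batch, in_ch, n_orientation, height, width),
        "arf": (out_ch, in_ch, n_orientation, kernel_size, kernel_size),
        # Assumption: spatial size is unchanged, as in the sizes quoted above.
        "output": (n_batch, out_ch, n_rotation, height, width),
    }


# The two layers from the question, on a hypothetical batch of 4 images of 28 x 28.
first = orconv_shapes(4, 1, 10, (1, 8), 3, 28, 28)
second = orconv_shapes(4, 10, 20, 8, 3, 28, 28)
print(first["output"])   # fed to the second layer
print(second["input"])
```

Under these assumptions the output of the first layer has the same shape as the input of the second, which is why nRotation of one layer matches nOrientation of the next.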

lychahaha commented 6 years ago

Okay, thanks. By the way, something in the paper confuses me.

  1. Orientation spin. How is an N-dim point (vector) spun? What is the rotation axis? (What is the definition of α in F'θ,pq(α) in the paper?)
  2. The paper says F'θ,pq is a sample of the function F'θ,pq(α), and that F'θ,pq(α) is a periodic function. But what do the other N-1 F'θ,pq(x) mean? They don't seem to be some P_dst that is rotated from some P_src.

ZhouYanzhao commented 6 years ago

@ljhandlwt, in ORN, feature maps and filters (ARFs) are vector fields that explicitly encode orientation information. Coordinate Rotation and Orientation Spin are steps in rotating a vector field. Here is a simple illustration: [image: how to rotate ARFs]. For more details, please check Sec. 3.1 in the paper.
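A minimal sketch of the two steps, for the simplest case only: N = 4 orientation channels and rotations by multiples of 90 degrees, where no interpolation is needed. This is not the repository's implementation. The names `rot90_ccw` and `rotate_arf` are hypothetical, and both the counterclockwise direction and the direction of the cyclic shift are assumptions chosen for illustration.

```python
from typing import List

Grid = List[List[float]]  # one orientation channel of an ARF, kH x kW


def rot90_ccw(grid: Grid) -> Grid:
    """Coordinate rotation of one square channel by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def rotate_arf(arf: List[Grid], k: int) -> List[Grid]:
    """Rotate an ARF with 4 orientation channels by k * 90 degrees.

    Step 1 (coordinate rotation): rotate every channel on the spatial grid.
    Step 2 (orientation spin): cyclically shift the channels, so the value that
    sat in orientation channel i ends up in channel (i + k) % 4.
    """
    n = len(arf)
    if n != 4:
        raise ValueError("this sketch only handles N = 4")
    k %= n
    rotated = []
    for channel in arf:
        for _ in range(k):
            channel = rot90_ccw(channel)
        rotated.append(channel)
    return [rotated[(i - k) % n] for i in range(n)]


# Toy ARF: only orientation channel 0 is non-zero, along the top row.
zeros = [[0.0] * 3 for _ in range(3)]
arf = [[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], zeros, zeros, zeros]
once = rotate_arf(arf, 1)
print(once[1])  # the top row now runs along the left column, in channel 1
```

In this sketch, applying the 90 degree rotation four times returns the original ARF, which is the periodicity one would expect from rotating a vector field by a full turn.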

lychahaha commented 6 years ago

@ZhouYanzhao, according to Sec. 3.1 and Fig. 2, an ARF has a shape of [W, W, N]. So an arrow in the ARF in Fig. 2, or in your picture, represents a number. Is that right?

ZhouYanzhao commented 6 years ago

@ljhandlwt ARFs are viewed as N-directional points on a grid (vector fields). For each arrow, the length represents its value (a number), and the angle indicates the corresponding orientation channel.
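The arrow reading can be sketched as follows: one grid point holds N numbers, and each number is drawn as an arrow whose length is the value and whose angle identifies the orientation channel. The helper name `point_to_arrows` is hypothetical, and the mapping of channel k to the angle 360 * k / N degrees is an assumption made for illustration.

```python
from typing import List, Tuple


def point_to_arrows(values: List[float]) -> List[Tuple[float, float]]:
    """Turn the N values at one grid point into (length, angle_in_degrees) arrows.

    Assumption: orientation channel k is drawn at 360 * k / N degrees.
    """
    n = len(values)
    return [(value, 360.0 * k / n) for k, value in enumerate(values)]


# One grid point of an ARF with N = 4 orientation channels.
arrows = point_to_arrows([0.5, 0.0, 1.0, 0.25])
for length, angle in arrows:
    print(f"length={length}, angle={angle}")
```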

askerlee commented 5 years ago

@ZhouYanzhao Oh, I think I finally understand what you mean. "N-directional points" refers to the activation value at (x, y), doesn't it? The activation value is a scalar for canonical conv filters, but here it is an N-dimensional vector. Am I right?