adobe / antialiased-cnns

pip install antialiased-cnns to improve stability and accuracy
https://richzhang.github.io/antialiased-cnns/

Larger strides/downsampling factors #37

Open chanshing opened 4 years ago

chanshing commented 4 years ago

First, thanks for the very nice work!

In your implementation, as well as in the paper, it seems that the proposed filters (the binomial coefficients) are only valid for a stride/downsampling factor of 2. Extrapolating from this, does that mean I need to use trinomial coefficients for stride 3, quadrinomial coefficients for stride 4, and so on?

By the way, you could simplify the code in downsample.py by using scipy.special.binom instead of hard-coding each filter. Something like a = np.asarray([binom(filt_size - 1, i) for i in range(filt_size)]) would handle arbitrary filt_size.
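
For reference, here is a minimal, self-contained sketch of that suggestion (filt_size = 5 is only an example value; the repo's actual Downsample constructor may differ):

import numpy as np
from scipy.special import binom

filt_size = 5  # example value; any positive integer works
# Row filt_size-1 of Pascal's triangle, e.g. [1, 4, 6, 4, 1] for filt_size=5
a = np.asarray([binom(filt_size - 1, i) for i in range(filt_size)])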

richzhang commented 4 years ago

Yes, the width of the filter can be computed automatically based on the downsampling factor. The binomial filter should only be used with stride 2, though, regardless of filter size. I'm working on supporting larger strides.

chanshing commented 4 years ago

I implemented something for the 1D case, since I mostly work with time series. The idea is to specify the filter order instead of the filter size, starting from the box filter (average pooling) of width equal to the downsampling factor; this is order 0. To get the higher-order filters, we repeatedly convolve with the box filter, order times (see http://nghiaho.com/?p=1159).

For example, for stride 2 and order 3, we start with the box filter [1, 1]. Convolving with the box filter once gives [1, 2, 1] (order 1), once more gives [1, 3, 3, 1] (order 2), and once again gives [1, 4, 6, 4, 1] (order 3). With stride 3: [1, 1, 1] -> [1, 2, 3, 2, 1] -> [1, 3, 6, 7, 6, 3, 1] -> ... (note that stride 2 recovers exactly the binomial filters discussed above).

Here's more or less what I did (again, this is only 1D):

import numpy as np
import torch

# Inside an nn.Module's __init__. `factor` is the stride/downsampling factor,
# `order` the filter order, `channels` the number of input channels.
box_kernel = np.ones(factor)  # order-0 filter: plain average pooling
kernel = np.ones(factor)
for _ in range(order):
    kernel = np.convolve(kernel, box_kernel)
kernel /= np.sum(kernel)  # normalize to unit sum
kernel = torch.Tensor(kernel)
# Shape (channels, 1, k): one copy per channel, for F.conv1d with groups=channels
self.register_buffer('kernel', kernel[None, None, :].repeat((channels, 1, 1)))
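
For completeness, a matching forward pass might look like this (a sketch only; it assumes self.factor was also stored in __init__, and the reflect padding is an illustrative choice, not something specified in the thread):

import torch.nn.functional as F

def forward(self, x):
    # x: (batch, channels, length)
    k = self.kernel.shape[-1]
    x = F.pad(x, (k // 2, (k - 1) // 2), mode='reflect')  # keep "same" alignment before striding
    return F.conv1d(x, self.kernel, stride=self.factor, groups=x.shape[1])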

For 2D and beyond we could take the outer product of the 1D filter with itself.
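
As a rough sketch of that 2D extension (reusing the hypothetical factor/order/channels names from the 1D snippet above, with example values; this is not code from the thread):

import numpy as np
import torch

factor, order, channels = 2, 3, 16  # example values
box_kernel = np.ones(factor)
kernel_1d = np.ones(factor)
for _ in range(order):
    kernel_1d = np.convolve(kernel_1d, box_kernel)
kernel_2d = np.outer(kernel_1d, kernel_1d)  # separable 2D filter
kernel_2d /= kernel_2d.sum()                # normalize to unit sum
kernel_2d = torch.Tensor(kernel_2d)
# Shape (channels, 1, k, k): one copy per channel, for F.conv2d with groups=channels
weight = kernel_2d[None, None, :, :].repeat((channels, 1, 1, 1))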