locuslab / TCN

Sequence modeling benchmarks and temporal convolutional networks
https://github.com/locuslab/TCN
MIT License

How to calculate the TCN receptive field? #44

Closed cswwp closed 4 years ago

cswwp commented 4 years ago

If I want to use a TCN on long sequences, how do I calculate the receptive field to guide the design of num_channels?

jerrybai1995 commented 4 years ago

The receptive field grows roughly exponentially if you use the default setting, because 2^i - 1 = 2^0 + 2^1 + ... + 2^{i-1}. You can use that as an estimate.
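
A minimal sketch of this estimate (assuming kernel_size = 2 and dilation doubling per level, counting one dilated conv per level; the discussion below refines this for the two convs in each residual block):

# Each level i adds (kernel_size - 1) * 2^i steps to the receptive field,
# and 2^0 + 2^1 + ... + 2^{n-1} = 2^n - 1.
def estimate_receptive_field(num_levels, kernel_size=2):
    return 1 + (kernel_size - 1) * (2 ** num_levels - 1)

print(estimate_receptive_field(8))  # 256, i.e. 2^8 for kernel_size = 2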

cswwp commented 4 years ago

> The receptive field grows roughly exponentially if you use the default setting, because 2^i - 1 = 2^0 + 2^1 + ... + 2^{i-1}. You can use that as an estimate.

Thank you, got it :)

shlomish3 commented 4 years ago

Hi! I was inspired by your wonderful work to use a TCN for my project. I wrote a short MATLAB script that is supposed to calculate the effective receptive field. Element j of the output vector RF gives the receptive field for a network with j hidden layers. Could you please evaluate the code below so that other people can use it? @jerrybai1995

k = 6; % kernel size
n = 7; % number of hidden layers
d = 2; % dilation factor

num_layers = 1:n+1; % hidden layers + input layer
dilation = d.^(num_layers - 1); % dilation at each layer
RF = zeros(1, length(num_layers));
RF(1) = k; % receptive field of the first layer is the kernel size
for layer = 2:length(dilation) % each subsequent layer adds (k-1)*dilation steps
    RF(layer) = RF(layer - 1) + (k - 1)*dilation(layer);
end
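
For reference, a direct Python port of the script above (same values and the same one-conv-per-layer assumption; a sketch, not the repo's own code):

k = 6  # kernel size
n = 7  # number of hidden layers
d = 2  # dilation factor

rf = [k]  # receptive field after the first layer is the kernel size
for layer in range(2, n + 2):  # layers 2 .. n+1, as in the MATLAB loop
    rf.append(rf[-1] + (k - 1) * d ** (layer - 1))
print(rf)  # [6, 16, 36, 76, 156, 316, 636, 1276]
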
david-waterworth commented 4 years ago

That looks right to me. I believe the equation is as follows:

RF = sum_{i=1}^{n} (k - 1) * d^(i-1)

where k = kernel size, n = number of hidden layers, d = dilation factor.
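
As a quick consistency check against the MATLAB script above (same example values; the off-by-one it shows is picked up below):

k, n, d = 6, 7, 2
print(sum((k - 1) * d ** (i - 1) for i in range(1, n + 1)))  # 635, vs. the script's RF(7) = 636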

youcaiSUN commented 4 years ago

@david-waterworth, I notice that there are 2 consecutive dilated convolution layers in each residual block, so the real receptive field is 2 * the value calculated by your equation. Am I right?

shlomish3 commented 4 years ago

I believe not. If you look at the tcn.py code, it shows that the number of residual blocks equals the number of hidden layers.

david-waterworth commented 4 years ago

You're correct that there are 2 convs per residual block. But I believe they have the same dimension, so they don't increase the receptive field. Also, you wouldn't double the value: the receptive field doesn't increase multiplicatively per layer, it's additive, i.e. each layer increases the receptive field by the length of one conv filter less one step.

youcaiSUN commented 4 years ago

> You're correct that there are 2 convs per residual block. But I believe they have the same dimension, so they don't increase the receptive field. Also, you wouldn't double the value: the receptive field doesn't increase multiplicatively per layer, it's additive, i.e. each layer increases the receptive field by the length of one conv filter less one step.

Hi David, thanks for your reply. You're right that the receptive field increases additively per layer. But I think two stacked layers with the same dilation do increase the receptive field. In the trivial case, suppose there are 2 identical conv1d layers with kernel size = 3 and dilation = 1: the receptive field after the first conv layer is 3 (i.e. 1 + 1*(3-1), where the middle "1" is the dilation), and after the second it is 5 (i.e. 1 + 1*(3-1) + 1*(3-1) = 1 + 2*1*(3-1)). By the way, you missed the "+1" in your equation.
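
A quick empirical check of this example (a sketch using gradient probing in PyTorch; it assumes randomly initialized, hence almost surely nonzero, weights):

import torch
import torch.nn as nn

# Two stacked Conv1d layers: kernel_size=3, dilation=1, no padding.
net = nn.Sequential(nn.Conv1d(1, 1, 3), nn.Conv1d(1, 1, 3))
x = torch.zeros(1, 1, 50, requires_grad=True)
y = net(x)
y[0, 0, 10].backward()  # probe a single output position

# Input positions with nonzero gradient form the receptive field.
print((x.grad[0, 0] != 0).sum().item())  # 5, matching 1 + 2*(3-1)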

david-waterworth commented 4 years ago

Yes, you're correct. I'm going to have to draw it up on the whiteboard again :)

johnsyin commented 4 years ago

I have written a function for this:

def get_receptive(kernel_size, levels, dilation_exponential_base):
    # One dilated conv per level: level l adds (kernel_size - 1) * d^(l-1) steps.
    return sum(dilation_exponential_base ** (l - 1) * (kernel_size - 1)
               for l in range(levels, 0, -1)) + 1
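
Per the discussion above, each residual block in tcn.py contains two dilated convs with the same dilation, so a variant that accounts for that would double each level's contribution (a sketch following youcaiSUN's correction; the function name is mine):

def get_receptive_residual(kernel_size, levels, dilation_exponential_base):
    # Two dilated convs per residual block: each level contributes twice.
    return sum(2 * dilation_exponential_base ** (l - 1) * (kernel_size - 1)
               for l in range(levels, 0, -1)) + 1

print(get_receptive(2, 8, 2))           # 256
print(get_receptive_residual(2, 8, 2))  # 511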