KhronosGroup / NNEF-Tools

The NNEF Tools repository contains tools to generate and consume NNEF documents
https://www.khronos.org/nnef

About the format of NNEF #29

Closed mei-chiao closed 6 years ago

mei-chiao commented 6 years ago

I have a question about the consistency of the stride format in NNEF between conv() and max_pool(). The first layer of AlexNet uses both conv() and max_pool().
Both functions need a stride for a 4-D tensor, but the format is not consistent:
conv ==> stride = [4, 4]
max_pool ==> stride = [1, 1, 2, 2]

See the example below.

This inconsistency may confuse the driver.

Thanks

version 1.0

graph network( input ) -> ( output )
{
    input = external(shape = [1, 1, 227, 227]);
    weights1 = variable(shape = [96, 1, 11, 11], label = 'Convolution-Layer-1/weights');
    biases1 = variable(shape = [1, 96], label = 'Convolution-Layer-1/biases');
    conv1 = conv(input, weights1, biases1, padding = [(0, 0), (0, 0)], border = 'constant', stride = [4, 4], dilation = [1, 1]);
    const1 = constant(shape = [4], value = [1.0, 55.0, 55.0, 96.0]);
    reshape1 = reshape(conv1, shape = [1, 96, 55, 55]);
    relu1 = relu(reshape1);
    norm1 = local_response_normalization(relu1, size = [1, 5, 1, 1], alpha = 0.0001, beta = 0.75, bias = 1.0);
    pool1 = max_pool(norm1, size = [1, 1, 3, 3], padding = [(0, 0), (0, 0), (0, 0), (0, 0)], border = 'ignore', stride = [1, 1, 2, 2]);

 

gyenesvi commented 6 years ago

Hi,

Thanks for the feedback!

There is a difference between conv() and max_pool() operations: convolution is only performed in spatial dimensions (height and width), while pooling can be performed in all 4 dimensions. That is why 2d convolution requires only 2 stride parameters, while pooling requires 4 stride parameters.

This is stated in the spec, and the wording will be improved in the final version to clarify this and to make the required length of those parameter arrays better defined for the parser/driver.
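
To make the difference concrete, here is a minimal Python sketch that computes the output shapes for the layer in the question. The helper name `out_extent` is hypothetical and not part of the NNEF tools, and the floor rounding is an assumption (it makes no difference for these sizes). The conv stride has one entry per spatial dimension, while the pooling size and stride have one entry per tensor dimension.

```python
def out_extent(size, window, stride, pad_before=0, pad_after=0, dilation=1):
    """Output extent along one dimension of a sliding-window operation."""
    effective_window = (window - 1) * dilation + 1
    return (size + pad_before + pad_after - effective_window) // stride + 1

input_shape = [1, 1, 227, 227]    # N, C, H, W
filter_shape = [96, 1, 11, 11]    # out channels, in channels, H, W

# conv: the stride covers only the spatial dimensions (H, W).
conv_stride = [4, 4]
conv_shape = [input_shape[0], filter_shape[0]] + [
    out_extent(s, k, st)
    for s, k, st in zip(input_shape[2:], filter_shape[2:], conv_stride)
]

# max_pool: size and stride cover all four dimensions (N, C, H, W).
pool_size = [1, 1, 3, 3]
pool_stride = [1, 1, 2, 2]
pool_shape = [
    out_extent(s, k, st)
    for s, k, st in zip(conv_shape, pool_size, pool_stride)
]

print(conv_shape)  # [1, 96, 55, 55]
print(pool_shape)  # [1, 96, 27, 27]
```

A size and stride of 1 in the batch and channel dimensions leaves those dimensions unchanged, which is why the pooling call above writes them out explicitly.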

mei-chiao commented 6 years ago

I understand. Thanks for the clear description.

By the way, I have a question about the padding format for the convolution and pooling operations. The NNEF convolution syntax is as below:
conv2 = conv(pool1, kernel2, bias2, padding = [(1,2), (3,4)], border = 'constant', stride = [1, 1], dilation = [1, 1])
So the padding syntax is "padding = [(1,2), (3,4)]". With zero padding, does "padding = [(1,2), (3,4)]" look like the image below?

image

gyenesvi commented 6 years ago

Yes, you are correct, this is the format of padding.
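
Since the image is not reproduced here, the following is a small Python sketch of one reading of that format. It assumes each pair is (before, after) for one spatial dimension, with height first and width second, and that border = 'constant' fills with zeros. The function name `pad_2d` is hypothetical and only for illustration.

```python
def pad_2d(image, padding, value=0):
    """Pad a 2-D list of rows; padding = [(top, bottom), (left, right)]."""
    (top, bottom), (left, right) = padding
    width = len(image[0]) + left + right
    rows = [[value] * width for _ in range(top)]        # rows added above
    for row in image:
        rows.append([value] * left + list(row) + [value] * right)
    rows += [[value] * width for _ in range(bottom)]    # rows added below
    return rows

image = [[1, 2],
         [3, 4]]
padded = pad_2d(image, [(1, 2), (3, 4)])
for row in padded:
    print(row)
```

Under that assumption a 2x2 input becomes 5 rows by 9 columns: 1 row above, 2 rows below, 3 columns to the left and 4 columns to the right.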

mei-chiao commented 6 years ago

Thanks for the information.