About the format of NNEF

mei-chiao commented 6 years ago

I have a question about the consistency of the stride format in NNEF between conv() and max_pool(). The first layer of AlexNet uses both conv() and max_pool().
Both functions operate on 4-D tensors, but the stride format is not consistent:
conv ==> stride = [4, 4]
max_pool ==> stride = [1, 1, 2, 2]

See the example below.

This inconsistency may confuse the driver.

Thanks

version 1.0

graph network( input ) -> ( output )
{
    input = external(shape = [1, 1, 227, 227]);
    weights1 = variable(shape = [96, 1, 11, 11], label = 'Convolution-Layer-1/weights');
    biases1 = variable(shape = [1, 96], label = 'Convolution-Layer-1/biases');
    conv1 = conv(input, weights1, biases1, padding = [(0, 0), (0, 0)], border = 'constant', stride = [4, 4], dilation = [1, 1]);
    const1 = constant(shape = [4], value = [1.0, 55.0, 55.0, 96.0]);
    reshape1 = reshape(conv1, shape = [1, 96, 55, 55]);
    relu1 = relu(reshape1);
    norm1 = local_response_normalization(relu1, size = [1, 5, 1, 1], alpha = 0.0001, beta = 0.75, bias = 1.0);
    pool1 = max_pool(norm1, size = [1, 1, 3, 3], padding = [(0, 0), (0, 0), (0, 0), (0, 0)], border = 'ignore', stride = [1, 1, 2, 2]);

gyenesvi commented 6 years ago

Hi,

Thanks for the feedback!

There is a difference between conv() and max_pool() operations: convolution is only performed in spatial dimensions (height and width), while pooling can be performed in all 4 dimensions. That is why 2d convolution requires only 2 stride parameters, while pooling requires 4 stride parameters.

This is stated in the spec, and the wording will be improved in the final version to clarify this and to make the required length of those parameter arrays better defined for the parser/driver.
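
To make the difference concrete, here is a minimal Python sketch of how the two stride formats apply to the shapes in the graph above. The helper names (`output_extent`, `conv_output_shape`, `pool_output_shape`) are hypothetical and not part of NNEF or NNEF-Tools, and the sketch assumes the usual sliding-window size formula with floor division and no padding, as in the posted example.

```python
def output_extent(input_extent, window, stride, pad_before=0, pad_after=0, dilation=1):
    """Output size along one dimension of a sliding-window operation (assumed formula)."""
    effective_window = (window - 1) * dilation + 1
    return (input_extent + pad_before + pad_after - effective_window) // stride + 1


def conv_output_shape(input_shape, filter_shape, stride):
    # conv: stride has one entry per spatial dimension only (height, width).
    batch = input_shape[0]
    out_channels = filter_shape[0]
    spatial = [
        output_extent(i, k, s)
        for i, k, s in zip(input_shape[2:], filter_shape[2:], stride)
    ]
    return [batch, out_channels] + spatial


def pool_output_shape(input_shape, size, stride):
    # max_pool: size and stride have one entry per tensor dimension (all 4 here).
    return [output_extent(i, k, s) for i, k, s in zip(input_shape, size, stride)]


conv1 = conv_output_shape([1, 1, 227, 227], [96, 1, 11, 11], stride=[4, 4])
pool1 = pool_output_shape(conv1, size=[1, 1, 3, 3], stride=[1, 1, 2, 2])
print(conv1)  # [1, 96, 55, 55]
print(pool1)  # [1, 96, 27, 27]
```

With a pooling size and stride of 1 in the batch and channel dimensions, those two dimensions pass through unchanged, so the 4-element pooling stride [1, 1, 2, 2] behaves like a purely spatial stride of [2, 2].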

mei-chiao commented 6 years ago

I understand. Thanks for the clear explanation.

By the way, I have a question about the padding format for the convolution and pooling operations. The NNEF convolution syntax is as follows: conv2 = conv(pool1, kernel2, bias2, padding = [(1,2), (3,4)], border = 'constant', stride = [1, 1], dilation = [1, 1])
So the padding syntax is "padding = [(1,2), (3,4)]". Is the zero padding for "padding = [(1,2), (3,4)]" applied as shown below?

gyenesvi commented 6 years ago

Yes, you are correct, this is the format of padding.
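
The illustration from the question is not preserved in this thread, so the following Python sketch shows one reading of "padding = [(1,2), (3,4)]". It assumes each tuple is (before, after) for one spatial dimension, with height first and width second. The function `zero_pad_2d` is a hypothetical helper for illustration, not an NNEF-Tools API.

```python
def zero_pad_2d(image, padding):
    """Zero-pad a 2-D list of numbers.

    padding is [(top, bottom), (left, right)], an assumed reading of the
    NNEF format: one (before, after) tuple per spatial dimension.
    """
    (top, bottom), (left, right) = padding
    padded_width = left + len(image[0]) + right

    rows = [[0] * padded_width for _ in range(top)]          # rows added above
    for row in image:
        rows.append([0] * left + list(row) + [0] * right)    # columns added on each side
    rows.extend([0] * padded_width for _ in range(bottom))   # rows added below
    return rows


image = [[1, 2],
         [3, 4]]
padded = zero_pad_2d(image, [(1, 2), (3, 4)])
for row in padded:
    print(row)
```

Under this assumption a 2x2 input becomes 5x9: one zero row above, two below, three zero columns on the left, and four on the right.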

mei-chiao commented 6 years ago

Thanks for the information.

KhronosGroup / NNEF-Tools

About the format of NNEF #29