KhronosGroup / NNEF-Docs

NNEF public repository
Apache License 2.0
14 stars 3 forks source link

border 'constant' value must be -inf as a default value for max_pool #18

Closed syoyo closed 5 years ago

syoyo commented 5 years ago

'constant' : the border is considered constant, currently the only supported value is 0

The spec says supported border 'constant' value is 0, but we need to set this to -inf as a default for max_pool.

Otherwise, we could not calculate correct max value from negative input tensor values for max_pooling with padding >= 1

gyenesvi commented 5 years ago

In case of pooling, you should use border = 'ignore' if you do not want to include the border values in the calculation. That will have the same effect as if the value was -inf for max pooling. Also, note that ignoring works for average pooling as well, since in that case you could not choose an appropriate constant value that does not effect the average, but ignoring the border will give you the right average.

syoyo commented 5 years ago

NNEF spec says border constant is the default for max_pool.

fragment max_pool(
    input: tensor<scalar>,
    size: integer[],
    border: string = 'constant',
    padding: (integer,integer)[] = [],
    stride: integer[] = [],
    dilation: integer[] = [] )
-> ( output: tensor<scalar> )
{
    output, index = max_pool_with_index(input, size = size, border = border,
                                        padding = padding, stride = stride,
                                        dilation = dilation);
}

For some ML framework(as far as I know, at least Chainer ) consider border pixels when padding >= 1 for pooling operation:

https://docs.chainer.org/en/stable/reference/generated/chainer.functions.max_pooling_2d.html https://docs.chainer.org/en/stable/reference/generated/chainer.functions.average_pooling_2d.html

For max pooling in chainer, this is equivalent to pad op with -inf and max_pool2d op with padding mode VALID in TensorFlow, since TensorFlow's max_pool op does not consider padded pixel when computing pooling.

So, my question is, first of all, is NNEF pooling operation consider border pixel when padding >= 1 and border = constant ?

gyenesvi commented 5 years ago

Of course it considers border values when it is constant.

TF does not consider the values (in case of same padding), which is equivalent to setting it to ignore, which is also equivalent to setting the value to -inf in Chainer.

By the way, in TF, in case of valid padding there are no padding values to consider (padding is of zero size).

syoyo commented 5 years ago

Of course it considers border values when it is constant.

So NNEF spec should be updated to describe it. And set default constant value to -inf for max_pool fragment as I pointed out, or remove border='constant from max_pool specification. For implementer's perspective, we'll need such a clear description of how to handle border values for max_pool.

TF does not consider the values (in case of same padding)

FYI Both SAME and VALID does not consider border pixels in TensorFlow(I use ave pooling for better explanation)

https://github.com/apache/incubator-mxnet/issues/10194

from __future__ import print_function

import torch
import torch.nn as nn

import tensorflow as tf

import numpy as np

x = torch.ones([1, 1, 4, 4])
pool = nn.AvgPool2d(2, padding=1)
y = pool(x)
print('torch', y.shape)
print('torch', y)

tf.enable_eager_execution()

x = tf.Variable(tf.ones([1, 4, 4, 1]))
y = tf.nn.avg_pool(x, [1, 2, 2, 1], [1, 1, 1, 1], 'SAME')

print('tf SAME', y.shape)
print('tf SAME', y)

y = tf.nn.avg_pool(x, [1, 2, 2, 1], [1, 1, 1, 1], 'VALID')

print('tf VALID', y.shape)
print('tf VALID', y)
torch torch.Size([1, 1, 3, 3])
torch tensor([[[[0.2500, 0.5000, 0.2500],
          [0.5000, 1.0000, 0.5000],
          [0.2500, 0.5000, 0.2500]]]])
tf SAME (1, 4, 4, 1)
tf SAME tf.Tensor(
[[[[1.]
   [1.]
   [1.]
   [1.]]

  [[1.]
   [1.]
   [1.]
   [1.]]

  [[1.]
   [1.]
   [1.]
   [1.]]

  [[1.]
   [1.]
   [1.]
   [1.]]]], shape=(1, 4, 4, 1), dtype=float32)
tf VALID (1, 3, 3, 1)
tf VALID tf.Tensor(
[[[[1.]
   [1.]
   [1.]]

  [[1.]
   [1.]
   [1.]]

  [[1.]
   [1.]
   [1.]]]], shape=(1, 3, 3, 1), dtype=float32)
gyenesvi commented 5 years ago

What needs to be updated in the NNEF spec? All border modes are described in section 4.3 (Border Modes heading).

Why would the default value need to be changed? Constant 0 is a valid border value, for example Caffe uses that for max and average pooling too. If the default setting does not work for your use case, it must be set to the border mode that matches the framework you are using (border = 'ignore' in the above case).

We have just noticed that the TF converter exports pooling with border = 'constant' instead of border = 'ignore', which should be the right setting. This would solve your problem, right? We will fix this soon.

syoyo commented 5 years ago

What needs to be updated in the NNEF spec?

It depends on NNEF spec. Either

(I have slightly edited issue title)

gyenesvi commented 5 years ago

In order to achieve what you want, the NNEF spec does not need to be changed, only border = 'ignore' must be used, as I explained above. What is the problem with that?

syoyo commented 5 years ago

@gyenesvi

https://www.khronos.org/registry/NNEF/specs/1.0/nnef-1.0.1.html

The spec(argmax_pool, since max_pool is redirected to argmax_pool) says

4.3.3. Index Based Sampling The operation argmax_pool returns the maximum value positions in the local window.

...

border must be one of 'ignore', 'constant', 'replicate', 'reflect', 'reflect-even'. Inference APIs are recommended to support 'constant' and 'ignore' values.

gyenesvi commented 5 years ago

So is the problem that supporting 'constant' border is recommended and you don't want to implement it? It is only a recommendation, not a must, so you can implement 'ignore' only if you want to.

syoyo commented 5 years ago

So is the problem that supporting 'constant' border is recommended and you don't want to implement it?

No. The spec declares

border must be one of 'ignore', 'constant', 'replicate', 'reflect', 'reflect-even'

...

must
When used alone, this word, or the term required, means that the definition is an absolute requirement of the specification

So we must support border ='constant' case(at least for training).

As I repeatedly said, what is the expected behavior when border = 'constant' and padding >=1 for NNEF max_pool? > @gyenesvi

My understanding is if we precisely follow the specification, we have to consider border value as 0 even for max_pool since the spec says

'constant' : the border is considered constant, currently the only supported value is 0
gyenesvi commented 5 years ago

That sentence does not say anything about what an implementation must support. It says that the valid values must be one of those listed. For example it is invalid to put border = 'hello'. It talks about when an NNEF document is valid semantically. In fact, the whole NNEF specification does not say anything about what an inference engine must support, since NNEF is a file format, there is no execution model defined for it. An implementation in this context is just a tool that can import NNEF (such as a file format converter), not necessarily execute it.

The next sentence recommends what inference APIs (or engines) should support, but they don't have to support everything listed there. It is fine to just support border = 'ignore' for pooling. However, if you wanted to support border = 'constant' as well, then the expected behavior would be to consider the zeros in the max operation (as it is described in 4.3, Border Modes heading, page 37).

syoyo commented 5 years ago

It is fine to just support border = 'ignore' for pooling. However, if you wanted to support border = 'constant' as well, then the expected behavior would be to consider the zeros in the max operation (as it is described in 4.3, Border Modes heading, page 37).

So, since default border mode is 'constant' for max_pool, expected behavior when border = 'constant' or border is omitted for max_pool is consider zero as border value.

Other people would also think this is an unexpected behavior or a bug of the spec. I'd like to hear comments from other people. @gyenesvi Is there other people working on the spec?

zoeoz commented 5 years ago

I've been getting email notifications on this thread so will make an effort to weigh in.

@syoyo, to your point if I understand correctly, using constant border with value of 0 in argmax_pool may give unexpected results depending on the actual use-case scenario. However, to @gyenesvi point, using ignore gives the correct expected result of the specific use-case scenario you mention, i.e., the outcome is equivalent to a hypothetical constant border with value of -inf.

A few observations:

  1. I don't see that the NNEF spec doesn't provide a mechanism to obtain the results you are expecting (just use border ignore).
  2. It does beg the question: is constant the most appropriate default value for the argmax_pool operation as opposed to ignore; or, even: what are the realistic use-case scenarios where border mode for argmax_pool should be anything other than ignore?
  3. Regardless of observation #2, as an implementer, you don't need to support constant border mode if you don't want to, and can even design your NNEF tool to report this as a warning or unsupported feature if you wish.
syoyo commented 5 years ago

I don't see that the NNEF spec doesn't provide a mechanism to obtain the results you are expecting

This issue happens when border ='constant' and padding >= 1.

what are the realistic use-case scenarios where border mode for argmax_pool should be anything other than ignore?

As far as I know, current ML framework does not provide border =' constant' parameter for max_pool op(At least I have confirmed it in TF, Chainer(PyTorch) and ONNX). max_pool op in each ML framework does not consider border pixels(or implicitly consider border value as -inf) for computing max pooling(but average pooling does).

Although future ML framework may consider border pixels with arbitrary value in max_pool, but it will be quite less chance to happen.

you don't need to support constant border mode if you don't want to, and can even design your NNEF tool to report this as a warning or unsupported feature if you wish

Ignoring default parameter(also ignoring the spec marked as must) isn't a good idea.

Actually I want(or must) support border = 'constant' since NNEF spec declares border = 'constant' as default setting for max_pool. And as described in the issue title, setting default constant value to -inf solves the issue, but NNEF spec declares border constant value currently supported is 0, so we have to treat border value as 0.

gyenesvi commented 5 years ago

@syoyo the NNEF working group at Khronos is working on the spec. Any company is welcome to join the working group (https://www.khronos.org/members/) if it wishes to influence the development of the standards.

Again, it is not marked as must in the spec. The spec only says it must be one of those values to be valid.

I agree that it may be strange if the default value is not supported, but it does not violate anything in the spec. It was defaulted to 'constant' to be consistent with all other operations, and since it can be overridden, we did not think it to be an issue anyways. Furthermore, the rationale was that just calculating with the border is a simpler implementation (especially in HW) than ignoring (where the window size needs to be adjusted), and since max pooling is almost always applied after relu, it does not change things anyway (meaning that if an inference engine chooses to implement max pooling with constant 0 border, it will give correct results in almost all cases).

Setting default constant to -inf is not a good solution in my opinion. First, then it would be redundant, as it would produce the same result as border = 'ignore'. Second, it would be inconsistent with the rest of the operations, so border = 'constant' would mean different thing for different operations (may or may not be an issue). Third, it may bring up further problems, like the default value would be something that currently cannot be expressed by the syntax (it is not possible to explicitly set any value to inf, that value does not exist in NNEF syntax), and some inference engines (ones using quantized integer representations) may not even be able to express the inf value.

On the other hand, if we set the default value to 'ignore' for pooling operations (which could make sense but would break backward compatibility), then some inference engines may not implement that setting (for average pooling definitely), so we are back to square one.

syoyo commented 5 years ago

@gyenesvi

Again, it is not marked as must in the spec. The spec only says it must be one of those values to be valid.

Ah, after reading NNEF spec carefully, NNEF does not mark must to the description of the behavior of border mode, so I think I've finally understand what you are saying and how NNEF spec works!

My understanding is each implementation(e.g. tool converter) are free to how to interpret border mode. We can ignore the behavior of border mode described in NNEF spec.

I thought NNEF was as strict as other Khronos specs like glTF and OpenGL/GLSL, where the spec are declared in much more strict manner and also provides schemas(glTF) or CTS(conformance test suites) for the verification of each implementation.

But it looks NNEF spec is not so strict than such existing Khronos specs, so it gives much more freedom to each implementation.

So, it should be OK to interpret border value to be -inf for max_pool, since

'constant' : the border is considered constant, currently the only supported value is 0

this sentence is also not marked as must. (more to say, I can interpret constant as ignore or whatever I want). Although this will give a different behavior for each NNEF implementations, which isn't good for the community(and implementators), but we need to accept such a situation.

max pooling is almost always applied after relu, it does not change things anyway (meaning that if an inference engine chooses to implement max pooling with constant 0 border, it will give correct results in almost all cases).

There are variant of activation functions which outputs negative value(e.g. leaky ReLU, PReLU, tanh) so we cannot assume input to max pooling is always positive. Also, other implementation such like TensorFlow-Lite has a feature to apply activation functions after pooling(FusedActivationFunctions https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/schema/schema.fbs#L331 ).

some inference engines (ones using quantized integer representations) may not even be able to express the inf value.

We can use -MAXVAL for integer types.

gyenesvi commented 5 years ago

Yes, you are correct that in some sense, an inference engine implementation is free to interpret border mode in a different way, and that NNEF is not exactly like other Khronos standards. The reason is that most Khronos standards are APIs, that describe an execution model and can be conformance tested, while NNEF is a file format without an execution model.

Basically, an NNEF file describes how a neural network was trained and how it was intended to be used. In that sense, it tells what the border mode was during training. An inference engine may use a different border mode if it is not capable of exactly matching the one described in the model, at the cost of acknowledging that it may not exactly give the same results. This is inevitable if we want to allow for implementations that use approximations.

However, the NNEF Working Group is working on a conformance testing framework for executing neural network models described in NNEF format. The purpose is to be able to some extent validate implementations of inference engines (for a subset of operations and parameter settings). In that context, it will be relevant which border mode we include in the tests, and I believe it makes sense to include only border = 'ignore' in case of max pooling.