dlshriver / dnnv

A Framework for Deep Neural Network Verification
https://docs.dnnv.org
MIT License
56 stars 18 forks source link

Help debugging: What is wrong with network and property #102

Open cherrywoods opened 2 years ago

cherrywoods commented 2 years ago

Hallo, I am running into problems using DNNV that probably stem from my input network and property, but I can't find out what the issue is. I isolated the parts that looked suspicious to me, but couldn't reproduce the error. I would be thankful for any pointers what might be wrong with my network.

The network and property are in this file: inputs.zip

DNNV shows network and property like this:

Verifying property:
Forall(x, ((([[ 0.   0.  ... -0.001 -0.001]] <= x) & (x <= [[1.  1.  ... 0.001 0.001]])) ==> (N(x)[0] >= 0.0)))

Verifying Networks:
N:
Input_0                         : Input([ 1 12], dtype=float32)
Gemm_0                          : Gemm(Input_0, ndarray(shape=(12, 12)), ndarray(shape=(12,)), transpose_a=0, transpose_b=1, alpha=1.000000, beta=1.000000)
Sub_0                           : Sub(ndarray(shape=(12,)), Gemm_0)
Relu_0                          : Relu(Sub_0)
Add_0                           : Add(Gemm_0, Relu_0)
Sub_1                           : Sub(Add_0, ndarray(shape=(12,)))
Relu_1                          : Relu(Sub_1)
Sub_2                           : Sub(Add_0, Relu_1)
Gemm_1                          : Gemm(Sub_2, ndarray(shape=(120, 12)), ndarray(shape=(120,)), transpose_a=0, transpose_b=1, alpha=1.000000, beta=1.000000)
Relu_2                          : Relu(Gemm_1)
Gemm_2                          : Gemm(Relu_2, ndarray(shape=(120, 120)), ndarray(shape=(120,)), transpose_a=0, transpose_b=1, alpha=1.000000, beta=1.000000)
Relu_3                          : Relu(Gemm_2)
Gemm_3                          : Gemm(Relu_3, ndarray(shape=(120, 120)), ndarray(shape=(120,)), transpose_a=0, transpose_b=1, alpha=1.000000, beta=1.000000)
Relu_4                          : Relu(Gemm_3)
Gemm_4                          : Gemm(Relu_4, ndarray(shape=(4, 120)), [-0.0484903  0.0484905 -0.0484903  0.0484905], transpose_a=0, transpose_b=1, alpha=1.000000, beta=1.000000)
Split_0                         : Split(Gemm_4, axis=-1, split=[2 2])
OutputSelect_0                  : OutputSelect(Split_0, 0)
Reshape_0                       : Reshape(OutputSelect_0, [-1  2])
OutputSelect_1                  : OutputSelect(Split_0, 1)
Reshape_1                       : Reshape(OutputSelect_1, [-1  2])
Sub_3                           : Sub(Reshape_0, Reshape_1)
Flatten_0                       : Flatten(Sub_3, axis=1)
Sub_4                           : Sub(0.0, Flatten_0)
Mul_0                           : Mul(Flatten_0, 2.0)
Relu_5                          : Relu(Mul_0)
Add_1                           : Add(Sub_4, Relu_5)
Unsqueeze_0                     : Unsqueeze(Add_1, axes=[1])
MaxPool_0                       : MaxPool(Unsqueeze_0)
Shape_0                         : Shape(MaxPool_0)
Slice_0                         : Slice(Shape_0, [0], [0], axes=[0])
Concat_0                        : Concat(['Slice_0', '[-1]'])
Reshape_2                       : Reshape(MaxPool_0, Concat_0)
Sub_5                           : Sub(0.001, Reshape_2)

This is the remaining DNNV output including the exception trace:

2022-09-09 18:14:34.634513: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-09-09 18:14:34.662944: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
2022-09-09 18:14:34.662965: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2022-09-09 18:14:34.663206: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
INFO     2022-09-09 18:14:34,671 (dnnv.verifiers.common.reductions.iopolytope.reduction) CONJUNCTION: ((((-1 * x)) <= [[-0.  -0.  ... 0.001 0.001]]) & (((1 * x)) <= [[1.  1.  ... 0.001 0.001]]) & (((1 * N(x)[0])) < 0))
Traceback (most recent call last):
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 437, in gemm_func
    result = operation.alpha * tf.matmul(
  File ".../lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File ".../lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 7107, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [1,2], In[1]: [1,1] [Op:MatMul]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".../bin/dnnv", line 8, in <module>
    sys.exit(_main())
  File ".../lib/python3.9/site-packages/dnnv/__main__.py", line 113, in _main
    return main(*cli.parse_args())
  File ".../lib/python3.9/site-packages/dnnv/__main__.py", line 74, in main
    result, cex = verifier.verify(phi, **params)
  File ".../lib/python3.9/site-packages/dnnv/verifiers/common/base.py", line 102, in verify
    return cls(phi, **kwargs).run()
  File ".../lib/python3.9/site-packages/dnnv/verifiers/common/base.py", line 137, in run
    subproperty_result, cex = self.check(subproperty)
  File ".../lib/python3.9/site-packages/dnnv/verifiers/common/base.py", line 106, in check
    executor_inputs = self.build_inputs(prop)
  File ".../lib/python3.9/site-packages/dnnv/verifiers/nnenum/__init__.py", line 50, in build_inputs
    prop.op_graph.simplify().export_onnx(onnx_model_file.name)
  File ".../lib/python3.9/site-packages/dnnv/nn/graph.py", line 36, in simplify
    return simplify(self.copy())
  File ".../lib/python3.9/site-packages/dnnv/nn/transformers/simplifiers/__init__.py", line 69, in simplify
    simplified_graph = OperationGraph(dnn.walk(simplifier))
  File ".../lib/python3.9/site-packages/dnnv/nn/graph.py", line 24, in walk
    result.append(visitor.visit(output_operation))
  File ".../lib/python3.9/site-packages/dnnv/nn/transformers/simplifiers/base.py", line 45, in visit
    operation = simplifier.visit(operation)
  File ".../lib/python3.9/site-packages/dnnv/nn/transformers/simplifiers/base.py", line 26, in visit
    result = super().visit(operation)
  File ".../lib/python3.9/site-packages/dnnv/nn/transformers/base.py", line 16, in visit
    result = visitor(operation)
  File ".../lib/python3.9/site-packages/dnnv/nn/transformers/simplifiers/convert_add.py", line 26, in visit_Add
    input_details = OperationGraph([input_op]).output_details[0]
  File ".../lib/python3.9/site-packages/dnnv/nn/graph.py", line 60, in output_details
    output = self(
  File ".../lib/python3.9/site-packages/dnnv/nn/graph.py", line 135, in __call__
    output = tf_func(*x, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 25, in func
    output = output_func(*inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 649, in reshape_func
    x, shape = _concretize([x_, shape_], inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 40, in _concretize
    concrete_values.append(variable(*inputs))
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 548, in maxpool_func
    x = _concretize([x_], inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 40, in _concretize
    concrete_values.append(variable(*inputs))
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 836, in unsqueeze_func
    x, axes = _concretize([x_, axes_], inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 40, in _concretize
    concrete_values.append(variable(*inputs))
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 99, in add_func
    a, b = _concretize([a_, b_], inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 40, in _concretize
    concrete_values.append(variable(*inputs))
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 637, in relu_func
    x = _concretize([x_], inputs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 40, in _concretize
    concrete_values.append(variable(*inputs))
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 67, in wrapped_func
    raise TensorflowConverterError(
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 58, in wrapped_func
    self._cache[func] = func(*args, **kwargs)
  File ".../lib/python3.9/site-packages/dnnv/nn/converters/tensorflow.py", line 437, in gemm_func
    result = operation.alpha * tf.matmul(
  File ".../lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File ".../lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 7107, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
dnnv.nn.converters.tensorflow.TensorflowConverterError: InvalidArgumentError: 

I tested:

I am using the latest commit on the develop branch.

I am running:

dnnv prop.py --seed 9129849 -v --save-violation "cx.npy" --network "N" net.onnx --nnenum

Best regards, David

dlshriver commented 1 year ago

Sorry for the late reply, my life is a bit busy right now. Thanks for another interesting bug. I'm not quite sure where the issue is yet, but my guess is that we are incorrectly inferring the shape of some tensor which causes a future simplification to be incorrect. This may take some time to figure out.

cherrywoods commented 1 year ago

Thank you for taking your time to answer me and for the update. I have tried to directly get verifiers running on my inputs, but without much luck. Therefore, the issue is still relevant.