AssertionError: Not equal to tolerance rtol=0.001, atol=1e-05 ONNX model could not be ported to Keras.

jimzhou112 commented 3 years ago

I'm using snn_toolbox to convert a PyTorch model into an SNN for simulation or deployment. When I invoke the toolbox from my Python script (snntoolbox.bin.run.main(path_to_config_file)), I get the following error and stack trace:

Initializing INI simulator...

Loading data set from '.npz' files in /content/drive/My Drive/snn_toolbox.

Pytorch model was successfully ported to ONNX.
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-ea0a1c7d565a> in <module>()
    166 ###################
    167 
--> 168 main(config_filepath)

4 frames
/usr/local/lib/python3.7/dist-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    838                                 verbose=verbose, header=header,
    839                                 names=('x', 'y'), precision=precision)
--> 840             raise AssertionError(msg)
    841     except ValueError:
    842         import traceback

AssertionError: 
Not equal to tolerance rtol=0.001, atol=1e-05
ONNX model could not be ported to Keras. Output difference: 
Mismatched elements: 23 / 23 (100%)
Max absolute difference: 158.25595
Max relative difference: 21.9259
 x: array([[  52.707054,   40.616642,   43.44081 ,   73.01086 ,   13.085248,
        -116.642006, -119.265816, -126.4653  ,  -86.646416,   14.50121 ,
          41.220455,   14.792315,  -30.303583,  -42.41754 ,    8.07667 ,...
 y: array([[ 20.35986 ,  31.249834, -12.284489,  23.593775,  29.186235,
         32.922356,  35.387695,  31.790651,  -8.836283, -33.100048,
         -9.865446, -42.051174, -57.875008, -25.554855, -28.467241,...

My config file:

[paths]
path_wd = /content/drive/My Drive/snn_toolbox
dataset_path = /content/drive/My Drive/snn_toolbox
filename_ann = pytorch_cnn

[tools]
evaluate_ann = True
normalize = True

[simulation]
simulator = INI
duration = 50
num_to_test = 100
batch_size = 50
keras_backend = tensorflow

[input]
model_lib = pytorch

[output]
plot_vars = {'v_mem', 'correlation', 'spiketrains', 'spikerates', 'activations', 'error_t'}

Thanks in advance for any help

rbodo commented 3 years ago

Hi,

The PyTorch integration works in two steps: we first export the PyTorch model to ONNX format, and then use the onnx2keras library to transform the ONNX model to Keras. After each step we check whether the transformation was successful by running the new model on some dummy input and comparing its output against that of the previous version. Your log tells me that the first step (PyTorch to ONNX) worked but the second (ONNX to Keras) didn't. Because this transformation is handled by the onnx2keras tool, you'd have to check their documentation to see whether your network architecture contains something they don't support.
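
To illustrate the kind of check I mean, here is a minimal sketch using numpy.testing.assert_allclose with the tolerances from your log (rtol=0.001, atol=1e-05). The helper name outputs_match and the sample arrays are made up for illustration; this is not the toolbox's actual code.

```python
import numpy as np


def outputs_match(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return True if `candidate` matches `reference` within tolerance.

    Hypothetical helper: it wraps numpy's assert_allclose, which passes when
    |candidate - reference| <= atol + rtol * |reference| element-wise.
    """
    try:
        np.testing.assert_allclose(
            candidate, reference, rtol=rtol, atol=atol,
            err_msg="Converted model output differs from the original.")
    except AssertionError:
        return False
    return True


# Stand-ins for the outputs of two model versions on the same dummy input.
reference = np.array([[1.0, -2.0, 3.0]])
slightly_off = reference * (1 + 5e-4)   # within the relative tolerance
far_off = reference + 1.0               # clearly outside the tolerance

print(outputs_match(reference, slightly_off))
print(outputs_match(reference, far_off))
```

In your trace, all 23 of 23 elements mismatch with a max absolute difference of about 158, so the two models are computing something different rather than drifting by rounding error.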

jimzhou112 commented 3 years ago

Hello,

Thank you for the insights. However, I don't see any discrepancies between my model architecture and the architectures onnx2keras supports. My model is a simple one-layer CNN composed of a convolution, a ReLU, and a fully connected layer. Below is the model in PyTorch:

import torch.nn as nn

class Model(nn.Module):

    def __init__(self):
        super(Model, self).__init__()

        # The input_shape field is required by SNN toolbox.
        self.input_shape = (1, 7, 7)

        self.conv = nn.Conv2d(1, 6, kernel_size=3, stride=1)
        self.relu = nn.ReLU()
        self.fc1 = nn.Linear(150, 23)

    def forward(self, x):
        x = self.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        return x

I've also attached my python script (based off this example). Do you have any other insights as to how I can resolve the error? Thanks for all the help.

main.py.zip

rbodo commented 3 years ago

Didn't see anything obviously off in the code.

Since the model is so simple, one way to debug or circumvent this would be to instantiate the model directly in Keras and transfer the weights manually, then see if the conversion goes through.
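
If you try the manual weight transfer, mind the layouts: PyTorch stores Conv2d weights as (out, in, kH, kW) and Linear weights as (out, in), while Keras Conv2D kernels are (kH, kW, in, out) and Dense kernels are (in, out). With Keras' default channels_last format, the flatten order also changes from (C, H, W) to (H, W, C), so the Dense kernel rows need to be permuted. Below is a rough NumPy-only sketch of that conversion, with random arrays standing in for your trained weights and hand-written forward passes standing in for the two frameworks. The function names are hypothetical, and I'm not claiming this ordering issue is what causes your assertion error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for the PyTorch parameters (shapes follow the model above).
conv_w = rng.standard_normal((6, 1, 3, 3))   # (out, in, kH, kW)
conv_b = rng.standard_normal(6)
fc_w = rng.standard_normal((23, 150))        # (out, in), in = 6 * 5 * 5
fc_b = rng.standard_normal(23)


def to_keras_layout(conv_w, fc_w, channels, height, width):
    """Convert PyTorch-style weights to Keras channels_last layouts."""
    conv_kernel = conv_w.transpose(2, 3, 1, 0)           # (kH, kW, in, out)
    out_features = fc_w.shape[0]
    # fc_w columns follow (C, H, W) flatten order; reorder them to (H, W, C).
    dense_kernel = (fc_w.reshape(out_features, channels, height, width)
                    .transpose(2, 3, 1, 0)
                    .reshape(height * width * channels, out_features))
    return conv_kernel, dense_kernel


def forward_channels_first(x, conv_w, conv_b, fc_w, fc_b):
    """Conv (valid, stride 1) -> ReLU -> flatten -> linear, on (N, C, H, W)."""
    n, _, h, w = x.shape
    out_c, _, kh, kw = conv_w.shape
    oh, ow = h - kh + 1, w - kw + 1
    y = np.zeros((n, out_c, oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, :, i:i + kh, j:j + kw]
            y[:, :, i, j] = np.einsum('nckl,ockl->no', patch, conv_w)
    y = np.maximum(y + conv_b[None, :, None, None], 0)
    return y.reshape(n, -1) @ fc_w.T + fc_b


def forward_channels_last(x, conv_kernel, conv_b, dense_kernel, fc_b):
    """Same network on (N, H, W, C) input with Keras-style weight layouts."""
    n, h, w, _ = x.shape
    kh, kw, _, out_c = conv_kernel.shape
    oh, ow = h - kh + 1, w - kw + 1
    y = np.zeros((n, oh, ow, out_c))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i:i + kh, j:j + kw, :]
            y[:, i, j, :] = np.einsum('nklc,klco->no', patch, conv_kernel)
    y = np.maximum(y + conv_b, 0)
    return y.reshape(n, -1) @ dense_kernel + fc_b


x_cf = rng.standard_normal((2, 1, 7, 7))     # dummy input, channels first
x_cl = x_cf.transpose(0, 2, 3, 1)            # same data, channels last

conv_kernel, dense_kernel = to_keras_layout(conv_w, fc_w, 6, 5, 5)
reference = forward_channels_first(x_cf, conv_w, conv_b, fc_w, fc_b)
ported = forward_channels_last(x_cl, conv_kernel, conv_b, dense_kernel, fc_b)
# Naive transfer: transpose fc_w without permuting for the new flatten order.
naive = forward_channels_last(x_cl, conv_kernel, conv_b, fc_w.T, fc_b)

np.testing.assert_allclose(ported, reference, rtol=1e-3, atol=1e-5)
print("permuted Dense kernel matches:", np.allclose(ported, reference))
print("naive transpose matches:", np.allclose(naive, reference))
```

In an actual Keras model the converted arrays would go into the corresponding layers via set_weights([kernel, bias]); after that, compare the Keras output against the PyTorch output on the same input before running the toolbox.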

Sorry that I can't really help debug the assertion error itself right now.

NeuromorphicProcessorProject / snn_toolbox

AssertionError: Not equal to tolerance rtol=0.001, atol=1e-05 ONNX model could not be ported to Keras. #96