Closed ZoltanV-V closed 2 years ago
Hi,
The nxsdk package is not part of the public nxsdk_modules_ncl repo; you can only find it on the remote INRC machine you use to start jobs on Loihi. Check which version of nxsdk you have installed there (zip files corresponding to the various releases are available in the local file system). In nxsdk-1.0.0, for instance, the module exists. I don't remember exactly when it was moved to its current location, probably around nxsdk-0.9.5.
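One quick way to confirm whether an installed nxsdk release ships the module named in the ImportError is to ask Python's import machinery directly. The sketch below is only an illustration: `module_available` is a hypothetical helper, not part of nxsdk, and the dotted path is the one quoted in the error message in this thread.

```python
import importlib.util


def module_available(dotted_name: str) -> bool:
    """Return True if `dotted_name` can be located by the import system.

    find_spec imports the parent packages but not the module itself, so a
    missing parent raises ModuleNotFoundError, treated here as "not available".
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        return False


if __name__ == "__main__":
    # Module path taken from the ImportError reported in this issue.
    target = "nxsdk.arch.n2a.compiler.tracecfggen"
    print(target, "available:", module_available(target))
```

Running this on the INRC machine, once per installed release, would show which nxsdk versions contain the module.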
Thank you, yes, I was able to find these files (I am using version 0.9.5 of nxsdk). I am not sure if I should be starting a new issue for this, but now when I try to convert my ANN to an SNN it gives me a system error: Unknown Opcode
I already checked that the versions of keras and tf on my local machine from where I generated the models as well as on loihi are the same:
Keras == 2.2.4 Tensorflow == 1.12.0
I also made sure to use the same method of loading/saving (i.e. tf.keras.models.load_model vs. keras.models.load_model) on both machines, and yet it still fails on Loihi. I am not sure why.
If possible, please update your software setup (nxsdk >= 1.0.0 and tensorflow >= 2.2.0). There is no need to install keras separately; it is included in the later versions of tensorflow. In fact, you should ensure that the ANN you create imports keras via from tensorflow import keras.
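The version requirements above (nxsdk >= 1.0.0, tensorflow >= 2.2.0) can be checked with a small comparison helper. This is a minimal sketch, assuming plain dotted numeric version strings; `parse_version`, `meets_minimum`, and the `installed` dictionary are illustrative names, and the installed versions would have to be filled in from your own environment.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.2.0' into (2, 2, 0); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, so that '1.12.0' ranks above '1.2.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.2' and '2.2.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions recommended in this thread.
MINIMUMS = {"nxsdk": "1.0.0", "tensorflow": "2.2.0"}

if __name__ == "__main__":
    # Example values: the setup reported earlier in the thread.
    installed = {"nxsdk": "0.9.5", "tensorflow": "1.12.0"}
    for name, minimum in MINIMUMS.items():
        ok = meets_minimum(installed[name], minimum)
        print(f"{name} {installed[name]} >= {minimum}: {ok}")
```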
Hope that will solve the issue.
Thank you so much!! I managed to upgrade to nxsdk 1.0.0, python3.8, tf == 2.2.0 and it actually reads in the model now. However there is still an error in the parsing process and I've copied the traceback:
Parsing input model...
Skipping layer InputLayer.
Skipping layer Lambda.
Parsing layer Reshape.
Parsing layer Reshape.
Parsing layer Dense.
Using activation relu.
Skipping layer Lambda.
Skipping layer Lambda.
Skipping layer Activation.
Parsing layer Dense.
Traceback (most recent call last):
File "ann_to_snn.py", line 108, in <module>
main(config_filepath)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/run.py", line 31, in main
run_pipeline(config)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/utils.py", line 88, in run_pipeline
model_parser.parse()
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 246, in parse
inbound = self.get_inbound_names(layer, name_map)
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 411, in get_inbound_names
inb_idxs = [name_map[str(id(inb))] for inb in inbound]
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 411, in <listcomp>
inb_idxs = [name_map[str(id(inb))] for inb in inbound]
KeyError: '140470844937744'
But I'm not sure what to make of the KeyError?
The toolbox isn't able to parse the Lambda layers. Not sure what you are using them for, but consider replacing them if possible.
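The KeyError can be read as follows: judging from the traceback, the parser keeps a `name_map` keyed by `str(id(layer))` and looks up every inbound layer in it. The toy below is not the toolbox's code; `ToyLayer` and `parse` are hypothetical stand-ins that only mimic that lookup, to show how a layer that is skipped without being registered makes the lookup for the next layer fail.

```python
class ToyLayer:
    """Hypothetical stand-in for a Keras layer with at most one inbound layer."""

    def __init__(self, kind, inbound=None):
        self.kind = kind
        self.inbound = inbound


def parse(layers, unsupported=("Lambda",)):
    """Register parsed layers in name_map and look up each layer's inbound entry."""
    name_map = {}
    for layer in layers:
        if layer.kind in unsupported:
            continue  # skipped layers never get an entry in name_map
        if layer.inbound is not None:
            # Same lookup pattern as in the traceback above.
            _ = name_map[str(id(layer.inbound))]
        name_map[str(id(layer))] = len(name_map)
    return name_map


if __name__ == "__main__":
    dense = ToyLayer("Dense")
    lam = ToyLayer("Lambda", inbound=dense)
    out = ToyLayer("Dense", inbound=lam)
    try:
        parse([dense, lam, out])
    except KeyError as err:
        print("KeyError for unregistered inbound layer:", err)
```

In this toy, the second Dense layer points at the skipped Lambda layer, whose id was never stored, so the dictionary lookup raises a KeyError with an object id as the key, much like the one reported.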
Hi, so I replaced the Lambda layers with tf.keras.layers.Layers module to create a custom Layer but the error persists. Does snntoolbox not support creating custom Layers at all?
I attached the traceback here as well:
Parsing input model...
Skipping layer InputLayer.
Parsing layer Reshape.
Parsing layer Reshape.
Parsing layer Reshape.
Parsing layer Dense.
Using activation relu.
Skipping layer LinearExcitLayer.
Skipping layer Activation.
Parsing layer Dense.
Traceback (most recent call last):
File "ann_to_snn.py", line 108, in <module>
main(config_filepath)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/run.py", line 31, in main
run_pipeline(config)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/utils.py", line 88, in run_pipeline
model_parser.parse()
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 246, in parse
inbound = self.get_inbound_names(layer, name_map)
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 411, in get_inbound_names
inb_idxs = [name_map[str(id(inb))] for inb in inbound]
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 411, in <listcomp>
inb_idxs = [name_map[str(id(inb))] for inb in inbound]
KeyError: '139929077085616'
Please check this section in the documentation on adding a new layer. The main question to consider before doing this work is whether / how your custom layer will work in the spiking domain.
Thank you for the link! I have added a parser for my custom layer by modifying config_defaults, keras_input_lib.py, and temporal_mean_rate_tensorflow.py and it looks like it finished parsing the model:
LAYER TYPE: LinearExcitLayer
Parsing layer LinearExcitLayer.
ELSE, layer, name_map: {'name': 'linear_excit_layer', 'trainable': True, 'dtype': 'float32', 'out_features': 1000, 'stdv': 0.009021097956087904} {'139874854591216': 0, '139874833220272': 1, '139874854591984': 2}
INBOUND: ['139874854591984']
INBOUND: 1
INB_IDXS: [2]
LAYER TYPE: Activation
Skipping layer Activation.
LAYER TYPE: Dense
Parsing layer Dense.
ELSE, layer, name_map: {'name': '21', 'trainable': True, 'dtype': 'float32', 'units': 1000, 'activation': 'linear', 'use_bias': False, 'kernel_initializer': {'class_name': 'Zeros', 'config': {}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None} {'139874854591216': 0, '139874833220272': 1, '139874854591984': 2, '139874833221616': 3}
INBOUND: ['139874833221616']
INBOUND: 1
INB_IDXS: [3]
Using activation relu.
LAYER TYPE: LinearExcitLayer
Parsing layer LinearExcitLayer.
ELSE, layer, name_map: {'name': 'linear_excit_layer_1', 'trainable': True, 'dtype': 'float32', 'out_features': 1000, 'stdv': 0.03162277660168379} {'139874854591216': 0, '139874833220272': 1, '139874854591984': 2, '139874833221616': 3, '139874833222480': 4}
INBOUND: ['139874833222480']
INBOUND: 1
INB_IDXS: [4]
LAYER TYPE: Activation
Skipping layer Activation.
LAYER TYPE: Dense
Parsing layer Dense.
ELSE, layer, name_map: {'name': '25', 'trainable': True, 'dtype': 'float32', 'units': 24, 'activation': 'linear', 'use_bias': False, 'kernel_initializer': {'class_name': 'Zeros', 'config': {}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None} {'139874854591216': 0, '139874833220272': 1, '139874854591984': 2, '139874833221616': 3, '139874833222480': 4, '139874833223536': 5}
INBOUND: ['139874833223536']
INBOUND: 1
INB_IDXS: [5]
Using activation linear.
Building parsed model...
But when it tries to build the parsed model, it fails, complaining about a missing module called keras_rewiring. I think it fails because there is no 'tensorflow.python.keras.layers.core.LinearExcit' (the name of my custom layer is LinearExcit). Here's the entire traceback:
Traceback (most recent call last):
File "ann_to_snn.py", line 109, in <module>
main(config_filepath)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/run.py", line 31, in main
run_pipeline(config)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/utils.py", line 89, in run_pipeline
parsed_model = model_parser.build_parsed_model()
File "/home/lib/python3.8/site-packages/snntoolbox/parsing/utils.py", line 816, in build_parsed_model
import keras_rewiring
ModuleNotFoundError: No module named 'keras_rewiring'
Where is the keras_rewiring module located?
You shouldn't need the keras_rewiring module; this else clause is not meant to be reached when adding a custom layer. You reach it here because your custom layer has a type that's not one of the standard keras layer types. Our assumption here is that even custom layers will be standard keras layers, because you should always be able to use the keras.layers.Layer class as base for your custom layer. If that is not possible in your case for some reason, then add another condition at line 809 to get your custom layer class.
My custom layer is using keras.layers.Layer as the base:
class LinearExcitLayer(Layer):
    def __init__(self, out_features=0, stdv=0, name=None, **kwargs):
        super(LinearExcitLayer, self).__init__(name=name)
        self.out_features = out_features
        self.stdv = stdv
        bias = tf.random.uniform([out_features], minval=-stdv, maxval=stdv)
        self.bias = tf.Variable(bias)
        super(LinearExcitLayer, self).__init__(**kwargs)

    def get_config(self):
        config = super(LinearExcitLayer, self).get_config()
        config.update({'out_features': self.out_features})
        config.update({'stdv': self.stdv})
        return config

    def call(self, inputs):
        return tf.add(inputs, self.bias)
But it fails at getattr(keras.layers, LinearExcitLayer) which is why it is trying to execute the else statement. So in load_model, I am including the custom_dict = {'LinearExcitLayer': LinearExcitLayer}. It seems to parse it but when it gets to trying to build the parsed model, that's where it has issues.
Your init function seems to initialize the super class twice, maybe fixing that solves it.
Otherwise please add another elif to line 809 to check for your custom layer. In the clause you can then simply do an import LinearExcitLayer as parsed_layer. Unfortunately I can't check the code now myself.
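The suggestion above amounts to a lookup with a fallback: try the standard Keras layers first, then a registry of custom classes, and only then fall into the final else branch. A minimal sketch of that pattern follows. It does not reproduce the toolbox source, and `standard_layers`, `CUSTOM_LAYERS`, and `resolve_layer_class` are hypothetical names; a `SimpleNamespace` stands in for `keras.layers` so the snippet runs without TensorFlow.

```python
from types import SimpleNamespace


class Dense:
    """Stand-in for keras.layers.Dense."""


class LinearExcitLayer:
    """Stand-in for the custom layer discussed in this thread."""


# Hypothetical substitutes for keras.layers and a custom-layer registry.
standard_layers = SimpleNamespace(Dense=Dense)
CUSTOM_LAYERS = {"LinearExcitLayer": LinearExcitLayer}


def resolve_layer_class(layer_type: str):
    """Look up a layer class by name, checking custom layers before giving up."""
    if hasattr(standard_layers, layer_type):
        return getattr(standard_layers, layer_type)
    elif layer_type in CUSTOM_LAYERS:
        return CUSTOM_LAYERS[layer_type]
    else:
        raise LookupError(f"Unknown layer type: {layer_type}")


if __name__ == "__main__":
    print(resolve_layer_class("Dense").__name__)
    print(resolve_layer_class("LinearExcitLayer").__name__)
```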
Thank you so much for the response, doing both fixes resolved the issue and it now successfully builds and tests the parsed model. However now when it builds the spike model, it gives me yet another error. Here's the full traceback:
Building spiking model...
Normalizing thresholds.
Using 100 samples for normalization.
INFO: Need ['0.00', '0.00', '0.00', '0.00', '0.00'] GB for layer activations.
input_0
Maximum increase in compartment voltage per timestep: 255.
Setting threshold of layer input_0 to 255 and scaling biases of subsequent layer by 1.0
Weight mantissa and exponent for subtractive-reset are 255 and 0, respectively
00Reshape_None
01Reshape_12288
02Dense_1000
Parameter scale: 2423.5357350838513
Maximum increase in compartment voltage per timestep: 44853.
Traceback (most recent call last):
File "ann_to_snn.py", line 99, in <module>
main(config_filepath)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/run.py", line 31, in main
run_pipeline(config)
File "/home/lib/python3.8/site-packages/snntoolbox/bin/utils.py", line 127, in run_pipeline
spiking_model.build(parsed_model, **testset)
File "/home/lib/python3.8/site-packages/snntoolbox/simulation/utils.py", line 435, in build
self.preprocessing(**kwargs)
File "/home/nxsdk_modules_ncl/snntoolbox/nx_backend.py", line 935, in preprocessing
temp = self.normalize_nx_model(**kwargs)
File "/home/nxsdk_modules_ncl/snntoolbox/nx_backend.py", line 1313, in normalize_nx_model
thresh_mant, thresh_exp = to_mantexp(
File "/home/nxsdk_modules_ncl/snntoolbox/nx_backend.py", line 1501, in to_mantexp
assert np.all(exp <= exp_max)
AssertionError
I just realized that you are using Loihi as backend, not the builtin simulator. My previous answer on adding a custom layer does not quite apply then: The last step of creating the new layer in the tensorflow backend would have to be done in the Loihi backend instead. That however is much more involved. Given that the function of your LinearExcitLayer is very simple, I would recommend a different approach: Use the existing DenseLayer but instead of trained weights use the identity matrix, so the input is not modified and only the biases are added.
Thank you for the response. Do you mean completely replacing the LinearExcitLayer with a Dense layer instead to eliminate having to create the custom layer in the Loihi Backend?
Use the existing DenseLayer but instead of trained weights use the identity matrix, so the input is not modified and only the biases are added.
By this do you mean that, when parsing the Dense layer, I should change it so that it uses the identity matrix instead of list(layer.get_weights())?
If I understand the purpose of this particular layer, you should not need to change anything in the parser or create a new custom layer. All you need is to set up your original network (ANN, before conversion) with a standard Dense layer where the weights are set to the identity matrix.
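To see why this works: a Dense layer without activation computes y = xW + b, so with W set to the identity matrix it reduces to y = x + b, which is what the call method of LinearExcitLayer does. The pure-Python sketch below checks that equivalence numerically; `identity` and `dense` are illustrative helpers, not Keras API, and the numbers are made up. In Keras itself, the weights of a built Dense layer can be assigned with set_weights, here with an identity kernel and the trained bias vector.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def dense(x, kernel, bias):
    """Dense layer without activation: y[j] = sum_i x[i] * kernel[i][j] + bias[j]."""
    return [
        sum(x_i * row[j] for x_i, row in zip(x, kernel)) + bias[j]
        for j in range(len(bias))
    ]


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    bias = [0.25, 0.5, -0.75]
    # With an identity kernel the layer only adds the bias to its input.
    print(dense(x, identity(3), bias))  # [0.75, -0.5, 1.25]
```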
I noticed that you are not using a non-linearity in this layer. Be aware that the SNN (on Loihi or the toolbox) automatically applies a rectifying non-linearity. This is a simple consequence of using a (possibly leaky) integrate and fire neuron model. You only get positive neuron activity (e.g. spike rate), hence a rectifier. If it's non-leaky, it will be a ReLU. So there will be a discrepancy between your LinearExcitLayer in the ANN and when you run it as an SNN.
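The rectification can be illustrated with a toy integrate-and-fire neuron. The sketch below is a deliberately simplified model (constant input, non-leaky, subtractive reset, at most one spike per step), not the Loihi compartment model; `spike_rate` and its parameters are hypothetical names. For constant inputs up to the threshold, the firing rate follows max(0, input / threshold), so negative pre-activations are clipped to zero just as a ReLU would do.

```python
def spike_rate(input_current, threshold=1.0, num_steps=100):
    """Simulate a non-leaky integrate-and-fire neuron and return spikes per step."""
    voltage = 0.0
    spikes = 0
    for _ in range(num_steps):
        voltage += input_current
        if voltage >= threshold:
            spikes += 1
            voltage -= threshold  # subtractive reset keeps the residual charge
    return spikes / num_steps


if __name__ == "__main__":
    for current in (-0.5, 0.0, 0.25, 0.5):
        # Negative and zero inputs never reach the threshold, so the rate is 0.
        print(current, spike_rate(current))
```

A linear layer in the ANN would pass the value -0.5 through unchanged, whereas the spiking neuron reports a rate of zero; that gap is the discrepancy described above.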
Sorry, I am still really confused by what you are suggesting. Do you mean I should change the architecture of my ANN completely (even before it's fed into the toolbox)? That is, removing both custom LinearExcitLayer() that I have in the architecture currently and replacing it with a standard Dense layer and saving a new .h5 model out of that? Because unless I do that, won't I still need to modify the toolbox parser later on, otherwise it will give me the same error I was getting previously where it did not recognize keras.layers.Layer when parsing the model?
Yes, that is what I'm suggesting. It should be considerably less work than modifying the parser and (more importantly) the Loihi NxTF compiler to support your new layer. However, before you start doing that, please consider my other comment above about the missing non-linearity. Such a mismatch can lead to unacceptably low performance of the SNN.
Thank you for the clarification! I have made all the changes you've suggested (replaced both of the LinearExcit custom layers with Dense layers and using non-linearity) but I am still getting the exact same error from last time:
File "/home/nxsdk_modules_ncl/snntoolbox/nx_backend.py", line 1501, in to_mantexp
assert np.all(exp <= exp_max)
AssertionError
Which happens when it's doing the final conversion to the Spiking layers. I have gone through utils.py and made sure to revert any code I changed in the parser but it still errors out. I'm not sure what could be causing it because all my layers are now whatever's built-in in Tensorflow.keras?
Oh! I resolved the AssertionError above and it looks like it has actually compiled the NxModel (!!) but it fails with this error:
INFO:DNN: Loading and mapping pre-compiled layers from /home/log/gui/test/model_dumps/mappables.
INFO:DNN: Mapping layer 07Dense_24.
INFO:DNN: Mapping layer 06Dense_1000.
INFO:DNN: Mapping layer 05Dense_1000.
INFO:DNN: Mapping layer 04Dense_1000.
INFO:DNN: Map time: 84.53067326545715 s, memory: 4497.875 MB
Traceback (most recent call last):
File "ann_to_snn.py", line 108, in
Great to hear you got a step further.
The log says: "INFO:DNN: Loading and mapping pre-compiled layers from /home/log/gui/test/model_dumps/mappables."
I think the error is due to the fact that the compiler tries to load old tmp files from the model_dumps directory to save time. But if these are corrupt or incomplete, the compilation will fail. Please delete the model_dumps folder and try again.
Thank you, it looks like that got rid of the error. But now I wonder if my model is simply too big to compile into a spiking model because this is the new error:
INFO:DNN: Finding best partition for 07Dense_24.
x
INFO:DNN: Layer 07Dense_24 was distributed across 1 core.
INFO:DNN: Finding best partition for 06Dense_1000.
.......................x
INFO:DNN: Layer 06Dense_1000 was distributed across 24 cores.
INFO:DNN: Finding best partition for 05Dense_1000.
.......................x
INFO:DNN: Layer 05Dense_1000 was distributed across 24 cores.
INFO:DNN: Finding best partition for 04Dense_1000.
.......................x
INFO:DNN: Layer 04Dense_1000 was distributed across 24 cores.
INFO:DNN: Finding best partition for 03Dense_1000.
...............................................................
Excluded the following partition candidates:
numDestinationGroups: 0
coreSizeInterleaved: 0
numSynFmts: 0
synMemPerAxon: 0
numSynMemWords: 60
numInputAxons: 2
numOutputAxons: 0
Traceback (most recent call last):
File "ann_to_snn.py", line 108, in
That is likely, since every one of your hidden layers comes with a million connections.
Aside from reducing the layer size, you may see an improvement when changing the synapse encoding: https://github.com/intel-nrc-ecosystem/models/blob/5e5b6521b8bcab6e3576e177784baaed57fd5280/nxsdk_modules_ncl/dnn/src/dnn_layers.py#L796
After reducing the model size (total params went down from 13M to 1,261,624), it is now failing to compile the spiking model with an AssertionError:
00Permute_3x64x64
01Reshape_None
02Reshape_12288
03Dense_100
Parameter scale: 706.6131428254955
Maximum increase in compartment voltage per timestep: 31415.
Traceback (most recent call last):
File "ann_to_snn.py", line 108, in
Is there something in config that needs to be modified when using a smaller model (aside from the obvious ones such as model name and working_dir)? Otherwise I am not sure why it would be failing when the general architecture is the same as my previous model (i.e. no new layers / custom layers).
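As a sanity check on the sizes discussed above: a Dense layer with n inputs and m units has n*m weights plus m biases, so each 1000-to-1000 hidden layer carries one million connections. The sketch below counts parameters for a stack of Dense layers. The layer widths for the reduced model are an assumption (12288 inputs, four hidden layers of 100 units, 24 outputs, all with biases), chosen because they reproduce the reported total of 1,261,624; `dense_params` and `total_params` are illustrative helpers.

```python
def dense_params(n_in, n_out, use_bias=True):
    """Number of trainable parameters in one Dense layer."""
    return n_in * n_out + (n_out if use_bias else 0)


def total_params(widths, use_bias=True):
    """Sum Dense parameters over consecutive widths, e.g. [in, h1, ..., out]."""
    return sum(dense_params(a, b, use_bias) for a, b in zip(widths, widths[1:]))


if __name__ == "__main__":
    # Connections in a single 1000 -> 1000 hidden layer (weights only).
    print(dense_params(1000, 1000, use_bias=False))       # 1000000
    # Assumed layout of the reduced model.
    print(total_params([12288, 100, 100, 100, 100, 24]))  # 1261624
```

Note that almost all parameters of the reduced model sit in the first Dense layer (12288 * 100 weights), because the flattened input is so wide.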
To exclude the possibility of old files being loaded, please make sure you are starting from a clean working_dir (i.e. no subfolders like normalization or model_dumps present there).
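Clearing those cached subfolders can be scripted. This is a minimal sketch, assuming the cache folders are named normalization and model_dumps as mentioned above and sit directly under the working directory; `clean_working_dir` is a hypothetical helper and the path in the usage example is a placeholder.

```python
import shutil
from pathlib import Path


def clean_working_dir(working_dir, subfolders=("normalization", "model_dumps")):
    """Delete cached subfolders under working_dir; return the names removed."""
    removed = []
    for name in subfolders:
        target = Path(working_dir) / name
        if target.is_dir():
            shutil.rmtree(target)  # removes the folder and everything inside it
            removed.append(name)
    return removed


if __name__ == "__main__":
    # Placeholder path: replace with the working_dir from your config.
    print(clean_working_dir("/path/to/working_dir"))
```

Only the named subfolders are touched, so the saved model and dataset files in the working directory stay in place.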
Then, in case you are not doing so already, I think you might need to normalize the input data manually to the range [0, 1]. It seems to me that the error might be caused by the neuron input exceeding what can be represented with the limited precision available on Loihi. Normalizing the input should help here. (Of course the original ANN would need to be trained on this new data representation.)
My dataset (x_test.npz) is already normalized between [0,1]:
x_test['arr_0']
array([[[[
[[0. , 0. , 0. ],
[0.22352941, 0.22352941, 0.22352941],
[0.25882354, 0.25882354, 0.25882354],
...,
[0.35686275, 0.35686275, 0.35686275],
[0.40784314, 0.40784314, 0.40784314],
[0.46666667, 0.46666667, 0.46666667]],
[[0.27058825, 0.27058825, 0.27058825],
[0.12156863, 0.12156863, 0.12156863],
[0.42745098, 0.42745098, 0.42745098],
...,
[0.39215687, 0.39215687, 0.39215687],
[0.3882353 , 0.3882353 , 0.3882353 ],
[0.4509804 , 0.4509804 , 0.4509804 ]],
...,
I have also created a new directory for the smaller model but it is still giving me the same AssertionError
Some things to check / try in your config:
numWeightBits in connection_kwargs is set to the maximum (8). Not specifying this parameter would also use the maximum by default.
desired_threshold_to_input_ratio in loihi is set to 1.
activation_percentile in normalization is set to a lower value (the default is 99.999 I think; you could try 99.99, 99.9, 99, 95).
Thank you for all the suggestions. I already had numWeightBits set to 8 and desired_threshold_to_input_ratio set to 1. I did try playing around with activation_percentile, but it isn't lowering the value of DVDT_MAX, and I think that is what's causing the error (printed out a traceback and copied it below). In my other (larger) model, that value doesn't really get higher than 1041.00. I'm not sure what is causing this though. I tried resaving the model and it is still the same.
00Permute_3x64x64
01Reshape_None
02Reshape_12288
03Dense_100
Parameter scale: 62720.67643839024
Maximum increase in compartment voltage per timestep: 35761.
AP, DVDT_MAX, MANT MAX, exp_max: 70.0, 35761.63515625, 256, 7
x: 35761.63515625 ()
35761.63515625
R: 139.69388732910156
7.12612508301364 8.0
EXP: 8
EXP MAX: 7
Traceback (most recent call last):
File "ann_to_snn_100.py", line 113, in
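The numbers printed above are consistent with the following arithmetic: the value is divided by the maximum mantissa (256), and the exponent is the base-2 logarithm of that ratio rounded up, which must not exceed exp_max (7). The sketch below reconstructs that check from the debug output only; it is not the actual to_mantexp implementation, and `required_exponent` and `exponent_in_range` are hypothetical names.

```python
import math


def required_exponent(value, mant_max=256):
    """Smallest non-negative exponent e such that value <= mant_max * 2**e."""
    if value <= mant_max:
        return 0
    return math.ceil(math.log2(value / mant_max))


def exponent_in_range(value, mant_max=256, exp_max=7):
    """Mirror the failing assertion: the exponent must not exceed exp_max."""
    return required_exponent(value, mant_max) <= exp_max


if __name__ == "__main__":
    dvdt_max = 35761.63515625  # value from the debug output above
    print(dvdt_max / 256)                # ratio R, about 139.69
    print(required_exponent(dvdt_max))   # 8
    print(exponent_in_range(dvdt_max))   # False, since 8 > exp_max = 7
    print(exponent_in_range(1041.0))     # True (value seen in the larger model)
```

Read this way, the larger model's value of about 1041 needs an exponent of only 3, while 35761 needs 8, one more than the printed limit.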
Sorry, for some reason I didn't see your reply until now.
If the dvdt_max doesn't change at all when you change the activation_percentile, then please set a breakpoint at this line to check whether the activation_percentile parameter is indeed changing as you set it in your config.
Aside from that, I don't think we're dealing with a bug here - it seems the activations in that particular layer simply are too large for the limited precision available on chip. One way to solve it is to train your model using a regularizer on the activations to ensure they don't go so high.
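The expected effect of activation_percentile can be sketched as follows, assuming the normalization scale is taken from a percentile of the recorded activations rather than from their maximum. The snippet only illustrates that idea with a nearest-rank percentile on made-up data; `percentile` is a hypothetical helper, not the toolbox's implementation. A lower percentile ignores more of the largest outliers and yields a smaller scale, which is why dvdt_max would be expected to change with this setting.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up activations: 97 small values plus three large outliers.
    activations = list(range(1, 98)) + [500, 800, 1200]
    for p in (100, 99, 95):
        print(p, percentile(activations, p))
```

With this data, moving from the 100th to the 95th percentile drops the scale from the largest outlier to a value inside the bulk of the distribution. If the reported dvdt_max stays the same across such settings, that suggests the config value is not reaching the normalization step.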
Hello, I recently updated my snntoolbox to 0.6.0 to convert TF models to SNN and I am getting this error:
"from nxsdk.arch.n2a.compiler.tracecfggen.tracecfggen import TraceCfgGen
ImportError: No module named 'nxsdk.arch.n2a.compiler.tracecfggen' "
I went to Intel NRC Ecosystem repo where I found nxsdk_modules_ncl but I cannot find the module mentioned in the error above.