NeuromorphicProcessorProject / snn_toolbox

Toolbox for converting analog to spiking neural networks (ANN to SNN), and running them in a spiking neuron simulator.
MIT License

Quantization #142

Closed Matthias-Hoefflin closed 5 months ago

Matthias-Hoefflin commented 8 months ago

Dear Mr. Rueckauer

First of all, thank you for your great work on this snntoolbox. It works perfectly if I directly convert a float32 .h5 TensorFlow model with your toolbox. However, I have difficulties using the cell parameter 'quantize_weights'. If I set it to true, the following error occurs:

AssertionError: In the [cell] section of the configuration file, 'quantize_weights' was set to True. For this to work, the layer needs to specify the fixed point number format 'Qm.f'.

I do not know where I have to specify this. If I use quantization-aware training and then load that .h5 file, I receive the error:

ValueError: Unknown layer: 'QuantizeLayer'. Please ensure you are using a keras.utils.custom_object_scope and that this object is included in the scope. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

And if I use post-training quantization from TensorFlow, I no longer have a .h5 file but a .tflite file.

Do you have an example of how I can use quantization with your toolbox? Thank you very much.

rbodo commented 8 months ago

Hi Matthias,

Unfortunately I don't have an example that works out of the box, but I can share some scripts that I used years ago for quantization experiments with the Distiller library. I won't be able to help much with debugging at this point, but hopefully you'll find some pointers that get you going.

low_precision.zip

(A good starting point would be low_precision/distiller/run_ptq.py for post-training quantization or run_qat.py for quantization-aware training.)
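Regarding the "Unknown layer: 'QuantizeLayer'" error: that typically just means the quantization-aware model has to be deserialized with the quantization objects in scope. A minimal sketch, assuming the model was saved after quantization-aware training with tensorflow_model_optimization (the file name is a placeholder):

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# quantize_scope() registers QuantizeLayer and related custom objects,
# so Keras can deserialize a quantization-aware-trained model.
with tfmot.quantization.keras.quantize_scope():
    qat_model = tf.keras.models.load_model('qat_model.h5')  # placeholder path

Whether the toolbox parser then handles those wrapper layers is another matter, though.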

Matthias-Hoefflin commented 8 months ago

Thank you very much for your fast response. I will have a look. However, in the meantime I am facing another issue.

I use a LeNet-5 on the Fashion-MNIST dataset. The parsed model comes really close to the ANN accuracy, at 94%. If I use temporal_mean_rate encoding, I achieve about 92% accuracy, which is good. But if I change to "ttfs", the parsed model still has the same accuracy, while the SNN accuracy drops to around 60%. Do you know why this happens? Below is the config I use. I am really confused that it works pretty well with temporal_mean_rate but not with ttfs, because ttfs performs pretty well on the MNIST dataset. Therefore, I would expect it to also work well on Fashion-MNIST.

config = configparser.ConfigParser()
config['paths'] = {
    'path_wd': WORKING_DIR,
    'dataset_path': DATASET_DIR,
    'filename_ann': MODEL_NAME,
    'runlabel': MODEL_NAME + '_' + str(NUM_STEPS_PER_SAMPLE)
}
config['tools'] = {
    'evaluate_ann': True,
    'parse': True,
    'normalize': True,
    'simulate': True,
    'convert': True
}
config['conversion'] = {
    'spike_code': 'ttfs',
    'softmax_to_relu': True
}
config['simulation'] = {
    'simulator': 'INI',
    'duration': NUM_STEPS_PER_SAMPLE,
    'num_to_test': NUM_TEST_SAMPLES,
    'batch_size': BATCH_SIZE,
    'keras_backend': 'tensorflow'
}
config['output'] = {
    'verbose': 2,
    'plot_vars': {
        'input_image', 'spiketrains', 'spikerates', 'spikecounts',
        'operations', 'normalization_activations', 'activations',
        'correlation', 'v_mem', 'error_t'
    },
    'overwrite': True
}
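I then write the config to a file and hand it to the toolbox entry point, following the pattern from the snn_toolbox example scripts (sketch; the config file name is arbitrary):

import os
from snntoolbox.bin.run import main

# Write the config to disk and run the full pipeline
# (parse, convert, simulate).
config_filepath = os.path.join(WORKING_DIR, 'config')
with open(config_filepath, 'w') as f:
    config.write(f)

main(config_filepath)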

rbodo commented 7 months ago

Remember that the TTFS encoding only uses a single spike per neuron to represent a floating point activation value. Temporal mean rate uses many times that number of spikes so it can achieve higher precision / accuracy more easily at the cost of increased computation. To illustrate some of the issues when using a single spike: In our coding scheme, large activations result in fast spikes, small activations in slow spikes. So if a neuron receives as input both slow and fast spikes, it may fire an output spike due to the fast input spike without waiting for the slow input spike (which could have inhibited the firing). We explain it in a bit more detail in the paper.
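As a toy illustration of that failure mode (just the idea that spike time is inversely related to activation, not the exact coding we use):

# Toy TTFS coding: larger activation -> earlier spike, e.g. t = 1/a.
def spike_time(activation, t_max=100.0):
    return min(t_max, 1.0 / max(activation, 1e-6))

t_exc = spike_time(0.9)  # strong excitatory input arrives early (~1.1)
t_inh = spike_time(0.1)  # inhibitory input arrives late (10.0)

# If the excitatory spike alone pushes the membrane potential over
# threshold, the neuron fires before the inhibition arrives; the
# cancellation that happened in the ANN comes too late in the SNN.
print(t_exc, t_inh)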

Unfortunately, MNIST is not a good predictor of the success of a method. Fashion-MNIST was designed explicitly to be harder than MNIST while still being small and easy to handle. We've struggled to make TTFS work on CIFAR, for example, so I'm not surprised that the accuracy dropped on Fashion-MNIST. There were a couple of things we tried to improve performance: a dynamic threshold, or training with quantized / clipped activations.
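Clipping activations during ANN training is straightforward in Keras, for example (a minimal sketch, not our exact training setup):

import tensorflow as tf

# Clip ReLU activations to [0, 1] during ANN training so they match the
# bounded spike rates of the converted SNN. (Sketch; layer sizes are
# placeholders, not a full LeNet-5.)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, input_shape=(28, 28, 1)),
    tf.keras.layers.ReLU(max_value=1.0),  # clipped activation
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax'),
])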

matthiashoefflin commented 7 months ago

Thank you for your response. OK, so you would not expect a bug or something like that?

I was confused because a paper that uses your toolbox reported an accuracy of 88.9% with TTFS on the Fashion-MNIST dataset. Therefore, I assumed there was some mistake.

rbodo commented 7 months ago

I don't think it is a bug; more likely it is a configuration issue / a matter of finding the right hyperparameters, or of training the ANN in a certain way before conversion. (It usually helps if the activations of all layers are distributed about equally between 0 and 1, or are clipped or quantized, because the converted SNN effectively quantizes and clips the spike rates due to its finite simulation resolution and maximum firing rate.) Perhaps you could ask the authors of that paper to share their config?
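One quick way to check those activation distributions before conversion (a minimal sketch with standard Keras; `model` and a test batch `x_test` are assumed to exist):

import numpy as np
import tensorflow as tf

# Probe model that exposes every layer's activations.
probe = tf.keras.Model(inputs=model.inputs,
                       outputs=[layer.output for layer in model.layers])

for layer, act in zip(model.layers, probe(x_test[:256])):
    a = np.asarray(act)
    print('{}: 99th percentile = {:.3f}, max = {:.3f}'.format(
        layer.name, np.percentile(a, 99), a.max()))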

matthiashoefflin commented 7 months ago

Thanks for your help.