larq / compute-engine

Highly optimized inference engine for Binarized Neural Networks
https://docs.larq.dev/compute-engine
Apache License 2.0

`convert_keras_model()` does not work as expected for BinaryDenseNet37 Dilated and XNORNet #744

Open ZhanqiuHu opened 2 years ago

ZhanqiuHu commented 2 years ago

I tried running the following code with Python 3.6 + LCE 0.6.2 and with Python 3.7/3.8 + LCE 0.7.0, and the generated `.tflite` files have unexpected sizes:

For Python 3.6 + LCE 0.6.2: XNORNet tflite: 88.9 MB; BinaryDenseNet37 tflite: 25.6 MB

For Python 3.7/3.8 + LCE 0.7.0: XNORNet tflite: 235.2 MB; BinaryDenseNet37 tflite: 5.4 MB (this looks normal)

Do you know what is causing this and what will be a solution? Thanks a lot!

```python
import os

import tensorflow as tf
import larq as lq
import larq_zoo as lqz
import larq_compute_engine as lce  # was missing; provides convert_keras_model

input_tensor = tf.keras.layers.Input(shape=(224, 224, 3))
# model = lqz.literature.BinaryDenseNet37Dilated(input_tensor=input_tensor, weights="imagenet")
model = lqz.literature.XNORNet(input_tensor=input_tensor, weights="imagenet")

lq.models.summary(model, print_fn=None, include_macs=True)

path = os.path.join(os.getcwd(), "tflite_models")
if not os.path.exists(path):
    os.makedirs(path)

name = model.name  # was undefined; use the Keras model's name for the output file
with open(os.path.join(path, name + ".tflite"), "wb") as flatbuffer_file:
    flatbuffer_bytes = lce.convert_keras_model(model)
    flatbuffer_file.write(flatbuffer_bytes)
```
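As an aside, a small helper like the following (hypothetical, not part of the report above) can be used to report a flatbuffer's size in MB, which is how the sizes in this issue are compared:

```python
import os
import tempfile

def flatbuffer_size_mb(path):
    """Return the size of a file in megabytes (10**6 bytes)."""
    return os.path.getsize(path) / 1e6

# Demo with a dummy file standing in for a real .tflite flatbuffer:
with tempfile.NamedTemporaryFile(suffix=".tflite", delete=False) as f:
    f.write(b"\x00" * 2_000_000)  # 2 MB of placeholder bytes
print(f"{flatbuffer_size_mb(f.name):.1f} MB")  # → 2.0 MB
os.remove(f.name)
```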
lgeiger commented 2 years ago

It looks like dilated convolutions weren't properly converted in LCE 0.6.2, which was based on the TensorFlow 2.5 converter; this seems to be fixed in LCE 0.7.
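For context, a dilated convolution inserts gaps between kernel taps, so a k×k kernel with dilation rate d covers an effective extent of k + (k−1)(d−1); this larger footprint is what the converter has to encode correctly. A plain-Python illustration of that standard formula (no TensorFlow required; the dilation rates shown are examples, not taken from the model definition):

```python
def effective_kernel_extent(k, d):
    """Effective spatial extent of a k-tap kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# A 3x3 kernel: dilation 1 covers 3x3, dilation 2 covers 5x5, dilation 4 covers 9x9.
for d in (1, 2, 4):
    print(f"dilation {d}: 3x3 kernel spans {effective_kernel_extent(3, d)} pixels")
```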

With respect to XNOR-Net: it looks like there was a regression in 0.7, or in the underlying TensorFlow converter, that leads to incorrect fusion of the XNOR weight quantizer in one of the layers: (screenshot of the converted model graph, 2022-07-15)

I'll need to take a closer look at this conversion issue next week; for now I'd recommend sticking with 0.6.2 for converting XNORNet.