Hi there, I'm currently benchmarking my BNN with LCE. Following the instructions of larq and LCE, I first implemented my BNN with Keras and converted it to .tflite with LCE. However, the actual size of my .tflite is much bigger than the theoretically expected one reported by lq.models.summary(). I found that the extra size (about 60K) is introduced by a parameter-free custom Keras layer in my model, which is shown below. (This custom layer is used as a substitute for Sparse-Dense Matrix Multiplication.)
class MyLayer(Layer):
    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.built = True

    def call(self, inputs, mask=None):
        data = inputs[0]    # A dense tensor of shape [N, D]
        idx = inputs[1]     # A dense tensor of shape [N, M]
        weight = inputs[2]  # A dense tensor of shape [N, M]
        idx_sparse = tf.sparse.from_dense(tf.cast(idx, dtype=tf.int32))
        weight_sparse = tf.sparse.from_dense(weight)
        output = tf.nn.embedding_lookup_sparse(data, idx_sparse, weight_sparse)
        return output

    def get_config(self):
        config = {}
        base_config = super(MyLayer, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
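For context, this is what the layer computes: each row of idx selects rows of data, and each row of weight combines them. A minimal standalone check of that behavior (the concrete values here are just an illustration, not from my model):

```python
import tensorflow as tf

# Three "embedding" rows to look up from
params = tf.constant([[1.0, 2.0],
                      [3.0, 4.0],
                      [5.0, 6.0]])
# One output row that combines rows 1 and 2 with weights 0.5 each
sp_ids = tf.sparse.from_dense(tf.constant([[1, 2]], dtype=tf.int64))
sp_weights = tf.sparse.from_dense(tf.constant([[0.5, 0.5]]))

out = tf.nn.embedding_lookup_sparse(params, sp_ids, sp_weights, combiner="sum")
# 0.5 * [3, 4] + 0.5 * [5, 6] = [4, 5]
```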
This parameter-free custom layer introduces ~60K of .tflite size, which is unaffordable in my case, because the rest of my BNN model only introduces 16K of .tflite size (slightly bigger than the theoretical model size). So I want to reduce its size.
It seems that the extra size is introduced by the sparse tensors. I made a simple test to prove it. Using the following call() function to replace the original one in MyLayer, the overall size of the .tflite is 17K, which means this MyLayer only adds 1K to the .tflite. (This MyLayer only contains two matmul operators.)
    def call(self, inputs, mask=None):
        data = inputs[0]  # A dense tensor of shape [N, D]
        idx = inputs[1]   # A dense tensor of shape [N, M]
        output = K.dot(K.dot(idx, tf.transpose(idx)), data)
        return output
Then, when a sparse tensor is involved, as in the following case where I just convert a dense tensor to a sparse tensor and then convert it back to a dense one, the overall size of the .tflite becomes 82K!
    def call(self, inputs, mask=None):
        data = inputs[0]  # A dense tensor of shape [N, D]
        idx = inputs[1]   # A dense tensor of shape [N, M]
        idx_sparse = tf.sparse.from_dense(idx)
        idx = tf.sparse.to_dense(idx_sparse)
        output = K.dot(K.dot(idx, tf.transpose(idx)), data)
        return output
So I'm wondering why the sparse tensors introduce so much extra .tflite size, and how I can reduce it and implement the Sparse-Dense Matrix Multiplication operator. Can you give me some hints?
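For reference, one dense-only rewrite I am experimenting with (a sketch under my own assumption that idx holds per-row indices into data and weight the matching combination weights; the function name is my own) is a gather plus weighted sum, which avoids SparseTensor ops entirely:

```python
import tensorflow as tf

def sparse_dense_matmul_dense(data, idx, weight):
    """Computes sum_m weight[n, m] * data[idx[n, m]] for each row n.

    data:   [N, D] float tensor
    idx:    [N, M] integer row indices into data
    weight: [N, M] float combination weights
    """
    gathered = tf.gather(data, tf.cast(idx, tf.int32))                # [N, M, D]
    return tf.reduce_sum(gathered * weight[..., tf.newaxis], axis=1)  # [N, D]
```

I am not sure whether this lowers cleanly to builtin TFLite ops either, though, so any pointers would be appreciated.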
Thank you.