larq / compute-engine

Highly optimized inference engine for Binarized Neural Networks
https://docs.larq.dev/compute-engine
Apache License 2.0

Bit-packing of input tensor along channel dimension #52

Closed arashb closed 4 years ago

arashb commented 5 years ago

The current implementation in compute-engine performs im2col first and bit-packs the resulting rows afterwards.

However, it makes more sense to bit-pack the input matrix along the channel dimension first (which may require extra bit-padding) and do im2col afterwards (or use a fused bit-packing/im2col algorithm).
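For illustration, here is a minimal sketch of what channel-wise bit-packing could look like. This is not the actual compute-engine implementation: the function name `PackChannelsHWC`, the 32-bit word size, and the sign-based binarization convention are all assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Bit-pack an HWC float tensor along the channel dimension into uint32
// words, zero-padding the last word when C is not a multiple of 32.
std::vector<std::uint32_t> PackChannelsHWC(const float* input,
                                           int height, int width,
                                           int channels) {
  const int packed_channels = (channels + 31) / 32;  // ceil(C / 32)
  std::vector<std::uint32_t> packed(
      static_cast<std::size_t>(height) * width * packed_channels, 0u);
  for (int h = 0; h < height; ++h) {
    for (int w = 0; w < width; ++w) {
      const float* pixel = input + (h * width + w) * channels;
      std::uint32_t* out =
          packed.data() + (h * width + w) * packed_channels;
      for (int c = 0; c < channels; ++c) {
        // Binarization convention assumed here: set the bit for
        // negative values, leave it clear for non-negative values.
        if (pixel[c] < 0.0f) out[c / 32] |= (1u << (c % 32));
      }
      // Bits for c >= channels stay 0: this is the extra bit-padding.
    }
  }
  return packed;
}
```

Because HWC keeps each pixel's channels contiguous, packing is a single linear pass per pixel, and im2col can then run unchanged on the packed `uint32` words, shrinking the im2col buffer by roughly 32x relative to a float buffer.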

Tombana commented 5 years ago

Memory layouts:

- HWC // TF Lite Conv2D input --> easy to bitpack channels before im2col
- HWC // (default) TensorFlow Conv2D input --> easy to bitpack channels before im2col
- OHWI // TF Lite Conv2D weights --> easy to bitpack channels before im2col
- HWIO // (default) TensorFlow Conv2D weights --> not easy to bitpack channels before im2col (see the index sketch after this list)
- HWO // TF Lite DepthwiseConv2D weights (not 100% sure about this one)
- HWI // TensorFlow DepthwiseConv2D weights when channel_multiplier is 1
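A small sketch of why the weight layout matters for packing. The index formulas below follow from the layout definitions; the helper names are illustrative and not taken from the compute-engine code base.

```cpp
#include <cstddef>

// OHWI: index = ((o * H + h) * W + w) * I + i, so the stride along the
// input-channel dimension i is 1. 32 consecutive elements can be packed
// into one uint32 with a straight linear read.
inline std::size_t IndexOHWI(int o, int h, int w, int i,
                             int H, int W, int I) {
  return ((static_cast<std::size_t>(o) * H + h) * W + w) * I + i;
}

// HWIO: index = ((h * W + w) * I + i) * O + o, so the stride along i
// is O. Packing the channel bits requires gathering strided elements
// (or transposing to OHWI first), which is why HWIO is not easy to
// bitpack before im2col.
inline std::size_t IndexHWIO(int h, int w, int i, int o,
                             int W, int I, int O) {
  return ((static_cast<std::size_t>(h) * W + w) * I + i) * O + o;
}
```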

Tombana commented 5 years ago

See the comment in #57. We can stick to OHWI in both TF and TF Lite.