larq / compute-engine

Highly optimized inference engine for Binarized Neural Networks
https://docs.larq.dev/compute-engine
Apache License 2.0

Separate our 'bsign' and 'bconv' ops #447

Closed: AdamHillier closed this issue 4 years ago

AdamHillier commented 4 years ago

At the moment, our bconv op can read float, int8, or bitpacked inputs, and can write float, int8, or bitpacked outputs. When the input is not bitpacked, the first step the op performs is a bitpacking procedure. Conceptually, this is almost a 'sub-op', which doesn't need to be the responsibility of the bconv. We could instead make the bconv op accept only bitpacked inputs, and insert dedicated bsign ops where needed that take float or int8 inputs and produce bitpacked outputs. A bsign op already exists in our code base, but we generally fuse it into the subsequent bconv and so don't use it directly. (A sketch of what this bitpacking step does is below.)
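
For reference, a minimal sketch of what the bitpacking 'sub-op' does conceptually: collapse the signs of 32 float32 (or int8) values into one int32 bitmask. The real kernel is vectorized C++; `pack_bits` and the bit ordering here are illustrative assumptions, not the LCE implementation:

```python
import numpy as np

def pack_bits(x: np.ndarray) -> np.ndarray:
    """Pack the sign bits of `x` (length a multiple of 32) into int32 words."""
    bits = (x < 0).astype(np.uint32)       # 1 where the value is negative
    words = bits.reshape(-1, 32)           # 32 sign bits per output word
    shifts = np.arange(32, dtype=np.uint32)
    packed = (words << shifts).sum(axis=1, dtype=np.uint32)
    return packed.view(np.int32)
```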

This would have several advantages:

A prerequisite for doing this is #446, so this is blocked for now.

lgeiger commented 4 years ago

I roughly implemented the required changes in the converter, which was rather painless. However, for the conversion patterns that read and write the bitpacked output of the bconv, it might make sense to discuss some of the implementation details beforehand. We briefly discussed the three possible implementations before, but I think it is useful to have them explicitly written down here for reference:

Option 1: "Dequantize/Quantize"

The main idea here is that we introduce a quantize and a dequantize op to replace the current bsign (alternative names would be bsign/dequantize, binarize/dequantize, pack_bits/unpack_bits, pack/unpack, ...). The signatures of the ops would be {float32, int8} -> {int32} for quantize and {int32} -> {float32, int8} for dequantize. This is very close to how TFLite and MLIR handle int8 quantization, and it is how the converter will handle it internally in any case, regardless of which option we pick for the kernels. While this would introduce a new op in LCE, the only other downside I see is that float32 -> float32 binarization would potentially be slower, since it would become float32 -> int32 -> float32. In general I don't think this is too problematic, as float32 -> float32 binarization is not really useful in practice anyway.
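
A hedged sketch of this op pair, reusing `pack_bits` from the earlier sketch as the quantize direction; `unpack_bits` (one of the alternative names listed above) is the dequantize direction. This is an illustration of the signatures described here, not the LCE kernels:

```python
import numpy as np

def unpack_bits(packed: np.ndarray, dtype=np.float32) -> np.ndarray:
    """Unpack int32 words back to +1/-1 values in `dtype`."""
    shifts = np.arange(32, dtype=np.uint32)
    bits = (packed.view(np.uint32)[:, None] >> shifts) & 1
    return np.where(bits == 1, -1, 1).astype(dtype).reshape(-1)

# float32 -> float32 binarization becomes the two-op chain mentioned
# above: float32 -> int32 -> float32.
x = np.random.randn(64).astype(np.float32)
assert np.array_equal(unpack_bits(pack_bits(x)), np.where(x < 0, -1.0, 1.0))
```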

Option 2: "All in one Bsign"

This option would combine the two ops from the previous approach into a single bsign op with the signature {float32, int8, int32} -> {float32, int8, int32}. While this makes the kernel implementation a bit more complicated, it would remove the need to add a new op.
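
A rough sketch of the dispatch such an all-in-one op would need, reusing `pack_bits`/`unpack_bits` from the sketches above; the branching structure is an assumption for illustration:

```python
import numpy as np

def bsign(x: np.ndarray, out_dtype) -> np.ndarray:
    """One entry point whose behaviour depends on the input/output dtypes."""
    if out_dtype == np.int32:            # binarize and bitpack
        return pack_bits(x)
    if x.dtype == np.int32:              # unpack bitpacked input to +1/-1
        return unpack_bits(x, out_dtype)
    return np.where(x < 0, -1, 1).astype(out_dtype)  # plain sign, no packing
```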

Option 3: "Current Bsign + bitpacked output"

This is the option which is closest to the current approach: we would add the possibility of outputting a bitpacked tensor, so that the signature of the bsign op becomes {float32, int8} -> {float32, int8, int32}. With this option we need to be a bit careful in the converter: we must only convert tfl.maxpool_2d -> lq.bsign to lq.bsign -> lq.bmaxpool_2d in cases where it is followed by a bconv op that supports bitpacked int32 input, since otherwise the converter would output a broken graph. A sketch of such a guard follows.
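
An illustrative sketch of that converter guard. The `Node` type and the `supports_bitpacked_input` flag are hypothetical stand-ins for the real MLIR pattern match, not the LCE converter API:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str
    consumers: list = field(default_factory=list)
    supports_bitpacked_input: bool = False

def can_reorder_maxpool_bsign(bsign: Node) -> bool:
    """Rewriting tfl.maxpool_2d -> lq.bsign into lq.bsign -> lq.bmaxpool_2d
    is safe only if every consumer is a bconv that accepts bitpacked int32."""
    return all(
        c.op == "lq.bconv2d" and c.supports_bitpacked_input
        for c in bsign.consumers
    )
```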