It might make sense to have a `packed` datatype, or maybe `packed8`, `packed32`, etc., which is really just `uint8` / `uint32` under the hood.
This way, we can have the bitpacking op that sends `float` to `packed32`, and the `bgemm` op should only accept `packedXX` as inputs and produce `intYY` as outputs. It will then properly give an error if you try to run `bgemm` on non-packed data, and if some operation is supposed to support both normal `uint8` and bitpacked data, it can distinguish the two.
We can rename the current bitpacking operation to `bsign`, because it is essentially the sign operator but with a different result type, namely `packedXX`.
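A minimal C++ sketch of the idea; the names `packed32`, `bsign`, and `binary_dot` are illustrative here, not actual implementation code:

```cpp
#include <cstdint>

// A strong typedef over uint32, so that ops requiring bitpacked inputs
// reject plain integer data at the type level.
struct packed32 {
  std::uint32_t bits;
};

// The `bsign` op: packs the signs of 32 floats into one packed32,
// sending float -> packed32.
inline packed32 bsign(const float* x) {
  std::uint32_t bits = 0;
  for (int i = 0; i < 32; ++i) {
    if (x[i] < 0.0f) bits |= (1u << i);  // bit i stores the sign of x[i]
  }
  return packed32{bits};
}

// A bgemm-style kernel that only accepts packed32 (passing raw uint32
// pointers is a compile-time error) and produces int32 outputs.
inline std::int32_t binary_dot(const packed32* a, const packed32* b, int n) {
  std::int32_t mismatches = 0;
  for (int i = 0; i < n; ++i) {
    // popcount of XOR counts the positions where the signs differ.
    mismatches += __builtin_popcount(a[i].bits ^ b[i].bits);
  }
  // Dot product of +1/-1 vectors: matches minus mismatches.
  return 32 * n - 2 * mismatches;
}
```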
EDIT: The way to implement these in TensorFlow (not Lite) is by using their `DT_VARIANT` datatype, which supports arbitrary C++ structs; it is somewhat explained in this blogpost. So we can use something like the sketch below.
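Roughly, wrapping the packed data in a struct stored inside a `DT_VARIANT` tensor could look like this; `PackedTensor` and its stub methods are assumptions, and the exact requirements for variant-stored types (`TypeName`, `Encode`, `Decode`) may vary with the TensorFlow version:

```cpp
#include <cstdint>
#include <string>
#include <vector>

#include "tensorflow/core/framework/variant.h"
#include "tensorflow/core/framework/variant_encode_decode.h"
#include "tensorflow/core/framework/variant_tensor_data.h"

// Hypothetical struct holding the bitpacked payload; a tensor of dtype
// DT_VARIANT can store arbitrary C++ objects like this one.
struct PackedTensor {
  std::vector<std::uint32_t> bits;  // the packed32 data

  // Variant-stored types provide a TypeName(), plus Encode/Decode
  // if the tensor ever needs to be serialized.
  std::string TypeName() const { return "PackedTensor"; }
  void Encode(tensorflow::VariantTensorData* data) const {
    // Serialize `bits` into `data` (omitted in this sketch).
  }
  bool Decode(tensorflow::VariantTensorData data) {
    // Restore `bits` from `data` (omitted in this sketch).
    return true;
  }
};

// Inside an op kernel, wrapping and unwrapping would look roughly like:
//   tensorflow::Variant v = PackedTensor{/*bits=*/{...}};
//   const PackedTensor* p = v.get<PackedTensor>();
```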
Since the full `bconv2d` operator will only use the packed datatype internally and has non-packed inputs and outputs, I think we do not have to do this.