Issues with word-based binary weight format

analog-cbarber commented 6 years ago

Currently, when QFullyConnected and QConvolution use the binary weight format (binarized_weights_only=True), they store the weights packed into machine words. There are two issues with this:

It is difficult to share the same weights between architectures with different word sizes, because the operator requires the size of the weights to be a whole number of machine words. There seems to be no reason why a model that works on a 64-bit machine should not work without modification on a 32-bit machine.
The format inherently assumes a particular byte ordering, so weights will not load correctly when moving between a big-endian and a little-endian machine. This may only matter when loading weights from disk.
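
To make the two issues concrete, here is a minimal NumPy sketch. It is not BMXNet's actual packing code: the helper pack_words and its bit layout (weight i goes to bit i % width of word i // width) are assumptions made only for illustration.

```python
import numpy as np

def pack_words(bits, dtype):
    """Hypothetical packer: weight i -> bit (i % width) of word (i // width)."""
    dtype = np.dtype(dtype)
    width = dtype.itemsize * 8
    if len(bits) % width:
        raise ValueError("row length is not a multiple of the word size")
    words = []
    for start in range(0, len(bits), width):
        word = 0
        for offset, bit in enumerate(bits[start:start + width]):
            word |= int(bit) << offset
        words.append(word)
    return np.array(words, dtype=dtype)

# Issue 1: a row of 96 weights fits 32-bit words but not 64-bit words.
bits96 = np.zeros(96, dtype=np.uint8)
words32 = pack_words(bits96, "<u4")      # three 32-bit words
try:
    pack_words(bits96, "<u8")
    fits_64 = True
except ValueError:
    fits_64 = False                      # 96 is not a multiple of 64

# Issue 2: the same 64-bit word has a different byte layout per endianness.
bits64 = np.zeros(64, dtype=np.uint8)
bits64[0] = 1                            # only the first weight is set
little = pack_words(bits64, "<u8").tobytes()
big = pack_words(bits64, ">u8").tobytes()

# Bytes written on a little-endian machine, read back as big-endian words.
misread = np.frombuffer(little, dtype=">u8")[0]
```

In this sketch the first weight ends up in bit 56 of the misread word instead of bit 0, so every weight in the row would be permuted after loading.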

It seems bad to have a weight format that is not portable across machine architectures.

Alternatives would be:

use bytes
use int32 regardless of the machine word size
allow multiple integer types, as long as the bit size of the type times the number of items equals the number of bits in a row
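
As a rough sketch of the first and third alternatives, assuming NumPy is used for packing: the function names below are hypothetical and are not part of BMXNet. np.packbits packs most significant bit first by default, and a uint8 array has the same byte layout on every machine.

```python
import numpy as np

def pack_row_bytes(bits):
    """Pack 0/1 values into uint8, most significant bit first."""
    return np.packbits(np.asarray(bits, dtype=np.uint8))

def unpack_row_bytes(packed, n_bits):
    """Recover the first n_bits 0/1 values from a packed uint8 row."""
    return np.unpackbits(packed)[:n_bits]

def row_bits_match(packed, bits_per_row):
    """Third alternative: accept any integer dtype whose bits fill the row exactly."""
    packed = np.asarray(packed)
    total_bits = packed.shape[-1] * packed.dtype.itemsize * 8
    return bool(np.issubdtype(packed.dtype, np.integer)) and total_bits == bits_per_row

# A deterministic example row of 96 binary weights.
bits = (np.arange(96) % 3 == 0).astype(np.uint8)

packed = pack_row_bytes(bits)                 # 12 bytes, independent of word size
restored = unpack_row_bytes(packed, len(bits))

ok_uint8 = row_bits_match(packed, 96)                          # 12 * 8 bits
ok_int32 = row_bits_match(np.zeros(3, dtype=np.int32), 96)     # 3 * 32 bits
ok_int64 = row_bits_match(np.zeros(2, dtype=np.int64), 96)     # 2 * 64 = 128 bits
```

A row of 96 weights is accepted as 12 bytes or as 3 int32 values, but not as int64 words, which matches the word-size problem described above.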

analog-cbarber commented 6 years ago

Another issue is that MXNet does not yet fully support int64:

>>> import mxnet as mx
>>> import numpy as np
>>> mx.ndarray.zeros((3,4), dtype=np.int64)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/cbarber/ws/bmxnet/python/mxnet/ndarray/utils.py", line 67, in zeros
    return _zeros_ndarray(shape, ctx, dtype, **kwargs)
  File "/Users/cbarber/ws/bmxnet/python/mxnet/ndarray/ndarray.py", line 3387, in zeros
    return _internal._zeros(shape=shape, ctx=ctx, dtype=dtype, **kwargs)
  File "<string>", line 34, in _zeros
  File "/Users/cbarber/ws/bmxnet/python/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
    ctypes.byref(out_stypes)))
  File "/Users/cbarber/ws/bmxnet/python/mxnet/base.py", line 146, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
MXNetError: Invalid Input: 'int64', valid values are: {'float16', 'float32', 'float64', 'int32', 'uint8'}, in operator _zeros(name="", dtype="int64", ctx="cpu(0)", shape="(3, 4)")

zeros is called during setup of deferred initialization for gluon Parameters, so this would hamper the use of deferred initialization with int64 weights.

analog-cbarber commented 6 years ago

The zeros/int64 problem is now MXNet issue #9536. I am afraid that, until it is fixed, it will probably not be feasible to support binarized weight parameters in the gluon interface as long as they use 'int64'.

hpi-xnor / BMXNet, issue #33