pytorch / glow

Endianness of the .weights file #3160

ghost opened this issue 5 years ago · Status: Open

ghost commented 5 years ago

This is not an issue per se, but I wonder whether endianness is respected when compiling for specific targets. I would guess the answer is "no," and I would also guess that it would be pretty difficult to implement, but I'm asking for confirmation.

For example, consider the following command line (which is what I use to compile a dynamically loadable bundle to run on ARM v7 Neon with Linux hard-float ABI):

vagrant@ubuntu-clean-for-test:~/dts/nn/xgmr/build/debug$ ./bin/x-model-builder -input-tensor-dims=1,2 -model-input-name=in -model=../../../xgmr.old/pytorch_tuts/ffnn.onnx -emit-bundle=../../../workbench/glow/bundles/00_ffnn_nonquantized/ -cpu -network-name=ffnn -llvm-compiler=clang++ -target=armv7-neon-linux-gnueabihf -llvm-compiler-opt="-shared -target armv7-neon-linux-gnueabihf -B /usr/arm-linux-gnueabihf/bin"

(x-model-builder is my own builder based on image-classifier.) Endianness is not an issue here, since both the build host and this ARM configuration are little-endian, but in principle if I were targeting some other hardware (e.g. SPARC) it might be.

If in fact target-specific endianness is not respected when producing the .weights file, this should probably be explicitly stated.

opti-mix commented 5 years ago

It is a good question! Currently BundleSaver (the class that stores weights into a file) does not take endianness into account.

But it should be relatively straightforward to extend the logic of e.g. BundleSaver::saveWeights to store weights using a specific endianness: it could walk over the payload of each weights tensor and emit each element in the requested byte order. The reader of the weights may need to be adjusted accordingly.
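
A minimal sketch of what such an extension could look like, written as a standalone helper rather than Glow's actual BundleSaver internals; the `Endianness` enum, the `saveWeights` signature, and the float-only payload are illustrative assumptions:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

enum class Endianness { Little, Big };

// Detect the byte order of the machine running the saver.
static bool hostIsLittleEndian() {
  const uint16_t probe = 1;
  uint8_t firstByte;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}

static uint32_t byteSwap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Write a float payload to `out`, converting each element to the
// requested byte order if it differs from the host's.
void saveWeights(std::FILE *out, const std::vector<float> &payload,
                 Endianness target) {
  const bool hostLittle = hostIsLittleEndian();
  const bool targetLittle = (target == Endianness::Little);
  for (float f : payload) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    if (hostLittle != targetLittle)
      bits = byteSwap32(bits);
    std::fwrite(&bits, sizeof(bits), 1, out);
  }
}
```

Swapping only when the host and requested orders differ keeps the common little-endian-to-little-endian path a plain copy of the payload.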

ghost commented 5 years ago

Ah, yes, it shouldn't be difficult to save using a specific endianness when it is given explicitly (e.g. via an extra option to the bundle builder), but deducing the correct endianness from the target options might be more involved.

I think it is sufficient to simply state somewhere in the docs that endianness is not respected (even though cross-compilation of bundles is possible), so on the inference end, when one loads the weights, one must take care of endianness. This isn't much of a burden, neither on the development end nor on the performance end, since weights are loaded only once upon initialization.
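
For the inference side, a hedged sketch of what "taking care of endianness" at load time could look like; `fixupWeightsInPlace` and the assumption that the file holds 4-byte float elements written in little-endian order are illustrative, not part of the generated bundle API:

```cpp
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Returns true if the CPU running inference is little-endian.
static bool hostIsLittleEndian() {
  const uint16_t probe = 1;
  uint8_t firstByte;
  std::memcpy(&firstByte, &probe, 1);
  return firstByte == 1;
}

// Byte-swap a blob of 4-byte (float) weights that was written in
// little-endian order so it matches a big-endian host. Done once at
// initialization, so the runtime cost is negligible.
void fixupWeightsInPlace(std::vector<uint8_t> &blob) {
  if (hostIsLittleEndian())
    return; // file order already matches the host, nothing to do
  for (std::size_t i = 0; i + 4 <= blob.size(); i += 4) {
    std::swap(blob[i], blob[i + 3]);
    std::swap(blob[i + 1], blob[i + 2]);
  }
}
```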

nickgg commented 5 years ago

In that case I think it would be sensible to define a default endianness and document that writing/reading should be done in that mode.

ghost commented 5 years ago

@nickgg: Well, if that is preferable to simply stating that the default endianness agrees with native endianness (i.e. that of the platform on which the bundle is being compiled), then I think we should pick little endian as the default, since it agrees with most desktop environments (x86 and beyond).

This will require changes to the bundle saver (if, by some off chance, the build machine is big-endian, we would have to explicitly save the weights in little-endian order). It may be easier to just say that the .weights file is saved using native endianness (native to the build platform, not necessarily the target platform).
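
If the project did commit to a little-endian default, the saver could decide at compile time whether a swap is needed on the build host. A small sketch using the GCC/Clang predefined byte-order macros; `byteSwapAllWeights` is a hypothetical helper, not existing Glow code:

```cpp
// Compile-time host byte-order check (GCC/Clang predefined macros).
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
constexpr bool kBuildHostIsBigEndian = true;
#else
constexpr bool kBuildHostIsBigEndian = false;
#endif

// Inside the saver: only big-endian build hosts need an explicit swap
// before writing the little-endian .weights file.
// if (kBuildHostIsBigEndian)
//   byteSwapAllWeights(payload);  // hypothetical helper
```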

nickgg commented 5 years ago

That's easier to do (in that we don't need to do anything right now), but it means you'd need to remember what the compile platform was, which could complicate distribution of binaries.