This PR makes small alterations to the .nnue storage format and adds support for LEB128 compression of weights (right now only for the feature transformer). An additional --ft_compression CLI parameter may be used when serializing a network to .nnue. Note that it's also valid to serialize from .nnue to .nnue, which can be used for compressing already existing networks. Example:
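A hedged illustration of what such an invocation might look like (the script name `serialize.py` and the flag value `leb128` are assumptions for the sake of the example, not guaranteed to match the final CLI):

```
python serialize.py source.nnue target.nnue --ft_compression=leb128
```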
Now, every saved tensor may have a header, which is currently present only if the tensor is compressed. This header specifies the compression method used and may contain other information useful for decoding. Since this header is optional (to maintain backwards compatibility), it is marked by a relatively long magic string so that the chance of a collision with an otherwise valid old network is minimal.
For LEB128 compression the header consists of the string "COMPRESSED_LEB128" encoded in UTF-8, followed by a little-endian int32 equal to the number of bytes taken by the compressed tensor data. Stockfish can of course choose to support only one variant, since we have full control over how networks are made and we don't require support for networks other than the default one.
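To make the layout concrete, here is a pure-Python sketch of the header format and a signed LEB128 codec. The actual implementation uses numba-accelerated encode/decode functions, so this is only an illustration of the on-disk format described above, not the shipped code:

```python
import struct

MAGIC = b"COMPRESSED_LEB128"  # UTF-8 magic string that opens the header

def leb128_encode(values):
    # Signed LEB128: 7 payload bits per byte, high bit set = more bytes follow.
    out = bytearray()
    for v in values:
        while True:
            byte = v & 0x7F
            v >>= 7  # arithmetic shift, so negatives converge to -1
            # Done when the remaining value matches the sign of this group.
            if (v == 0 and not (byte & 0x40)) or (v == -1 and (byte & 0x40)):
                out.append(byte)
                break
            out.append(byte | 0x80)
    return bytes(out)

def leb128_decode(data, count):
    values, pos = [], 0
    for _ in range(count):
        result, shift = 0, 0
        while True:
            byte = data[pos]
            pos += 1
            result |= (byte & 0x7F) << shift
            shift += 7
            if not (byte & 0x80):
                if byte & 0x40:          # sign-extend negative values
                    result -= 1 << shift
                break
        values.append(result)
    return values

def write_compressed(f, values):
    # Header: magic string, then little-endian int32 byte count, then payload.
    payload = leb128_encode(values)
    f.write(MAGIC)
    f.write(struct.pack("<i", len(payload)))
    f.write(payload)
```

Small weights (the common case after quantization) encode to one byte each, which is where the size savings come from.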
Sibling PR on the Stockfish side: https://github.com/official-stockfish/Stockfish/pull/4617
Thanks to @MaximMolchanov for suggesting using numba and providing fast leb128 encode/decode functions.