pack_size() and pack_size128() can be written more compactly by taking advantage of the CPU's ability to count the leading zeros of a number. I've quickly checked pack_size(): on AArch64 it shrinks from 27 to 5 instructions, and on x86_64 from 26 to 8.
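As a rough illustration of the idea, here is a minimal sketch of a size computation based on a leading-zero count. The real encoding behind pack_size() is not shown in this description, so the sketch assumes a varint layout with 7 payload bits per byte; the names `varint_size` and `varint_size128` are hypothetical stand-ins, not the functions changed here.

```rust
/// Hypothetical stand-in for pack_size(): number of bytes needed to encode
/// `value`, ASSUMING a varint layout with 7 payload bits per byte.
pub fn varint_size(value: u64) -> usize {
    // `value | 1` makes zero behave like one, so the result is never 0 bytes.
    let significant_bits = 64 - (value | 1).leading_zeros() as usize;
    // Round up to a whole number of 7-bit groups.
    (significant_bits + 6) / 7
}

/// Hypothetical stand-in for pack_size128(), under the same assumption.
pub fn varint_size128(value: u128) -> usize {
    let significant_bits = 128 - (value | 1).leading_zeros() as usize;
    (significant_bits + 6) / 7
}
```

`leading_zeros()` is expected to compile down to a single instruction where the target provides one (for example CLZ on AArch64), which is what replaces a chain of range comparisons. The instruction counts quoted above come from the actual change, not from this sketch.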