f4b6a3 / uuid-creator

UUID Creator is a Java library for generating Universally Unique Identifiers.
MIT License
410 stars 44 forks source link

BaseN encoding speed improvement #63

Closed agentgt closed 2 years ago

agentgt commented 2 years ago

By using this libdivide4j I was able to double the speed of your BaseN encoder for non powers of 2.

public class FastDivNRemainderEncoder extends BaseNEncoder {

    private final int radix;
    private final int length;
    private final char padding;

    private static final int UUID_INTS = 4;
    private static final long HALF_LONG_MASK = 0x00000000ffffffffL;

    private final FastDivision.Magic magic;

    public FastDivNRemainderEncoder(BaseN base) {
        super(base);
        radix = base.getRadix();
        length = base.getLength();
        padding = base.getPadding();
        magic = FastDivision.magicUnsigned((long) radix);
    }

    @Override
    public String apply(@SuppressWarnings("null") UUID uuid) {

        // unsigned 128 bit number
        int[] number = new int[UUID_INTS];
        number[0] = (int) (uuid.getMostSignificantBits() >>> 32);
        number[1] = (int) (uuid.getMostSignificantBits() & HALF_LONG_MASK);
        number[2] = (int) (uuid.getLeastSignificantBits() >>> 32);
        number[3] = (int) (uuid.getLeastSignificantBits() & HALF_LONG_MASK);

        char[] buffer = new char[length];
        int b = length; // buffer index

        // fill in the buffer backwards using remainder operation
        while (!isZero(number)) {
            final int[] quotient = new int[UUID_INTS]; // division output
            final int remainder = remainder(number, quotient);
            buffer[--b] = alphabet.get(remainder);
            number = quotient;
        }

        // add padding to the leading
        while (b > 0) {
            buffer[--b] = padding;
        }

        return new String(buffer);
    }

    protected int remainder(int[] number, int[] quotient /* division output */) {

        long temporary = 0;
        long remainder = 0;

        for (int i = 0; i < UUID_INTS; i++) {
            temporary = (remainder << 32) | (number[i] & HALF_LONG_MASK);
            // quotient[i] = (int) (temporary / divisor);
            long q = FastDivision.divideUnsignedFast(temporary, magic);
            // remainder = temporary % divisor;
            long r = temporary - q * magic.divider;

            quotient[i] = (int) q;
            remainder = (int) r;
        }

        return (int) remainder;
    }

    private boolean isZero(int[] number) {
        return number[0] == 0 && number[1] == 0 && number[2] == 0 && number[3] == 0;
    }
}

I don't really fully understand how the algorithm works but it speeds up divmod-ing quite a bit on non power of 2s. I also don't understand why you modulus (%) after you already done the division. Maybe the JIT is smart enough. Doing multi and then subtracting (like I did here) might speed it up as well.

I also have an in house encoder that I call Fast57 (but it really could be any BaseN) that divmods the MSB long and the LSB long separately using the fast division algorithm and then concatenates them. That approach is about twice as fast as even the improved BaseN encoder from the above however it can't do arbitrary lengths of bytes since it relies on longs.

fabiolimace commented 2 years ago

Hi, @agentgt!

I really liked the libdivide4j library. But unfortunately I can't add a dependency for the specific case.

However I added a new parameter in BaseNCodec for you to use FastDivision. All you need to do is inject a function that uses FastDivision into a codec.

CustomDivider is a division function interface that returns quotient and remainder. It can be used to boost encoding speed utilizing specialized libraries like libdivide.

The addition of CustomDivider did not affect performance. In fact, the refactoring done to include it improved performance by 15% for decoding and 8% for encoding.

Here is a simple function example:

    CustomDivider divideBy64 = x -> new long[] { x / 64, x % 64 };

Here is an complete codec example using FastDivision:

    public class FastBase62Codec extends BaseNCodec {

        private static final int RADIX = 62;
        private static final BaseN BASE_N = new BaseN(RADIX);

        // make the magic happen with fast division :)
        private static final FastDivision.Magic MAGIC = FastDivision.magicSigned(RADIX);
        private static final CustomDivider DIVIDER = x -> {
            long quotient = FastDivision.divideSignedFast(x, MAGIC);
            long remainder = x - quotient * MAGIC.divider;
            return new long[] { quotient, remainder };
        };

        // a shared immutable instance
        public static final FastBase62Codec INSTANCE = new FastBase62Codec();

        public FastBase62Codec() {
            super(BASE_N, DIVIDER);
        }
    }

Benchmark before improvement:

-----------------------------------------------------------------
Benchmark                  Mode  Cnt     Score     Error   Units
-----------------------------------------------------------------
Base62Codec.decode()      thrpt    5  8667,315 ± 246,583  ops/ms
Base62Codec.encode()      thrpt    5  1310,038 ±  30,726  ops/ms
-----------------------------------------------------------------

Benchmark after improvement:

-----------------------------------------------------------------
Benchmark                  Mode  Cnt     Score     Error   Units
-----------------------------------------------------------------
Base62Codec.decode()      thrpt    5  9964,451 ±  99,106  ops/ms   +15%
Base62Codec.encode()      thrpt    5  1421,416 ±  41,249  ops/ms    +8%
FastBase62Codec.encode()  thrpt    5  3497,924 ± 135,835  ops/ms  +167%
-----------------------------------------------------------------

As you can see FastBase62Codec improved the performance by 2.5x. Thank you so much!

fabiolimace commented 2 years ago

Released version v4.6.0.

agentgt commented 2 years ago

The library is only one class so you might be able to just copy the class.

The algorithm believe is in the public domain and I didn’t see a license for either the c or java version.

fabiolimace commented 2 years ago

Making a copy of libdivide4j is also adding a dependency. I intend to keep this lib self-contained and efficient at the same time, although I know it's an unattainable ideal as nothing is really independent. Alternatively, I could just copy part of libdivide4j, but I'm not confident in adding part of something I can't fully understand. libdivide is really magic for me.

However, if a developer needs to speed up performance with libdivide4j, this can be done by injecting a custom division function. It can be considered an option. I didn't notice any significant performance difference when plugging FastDivision wrapped in a CustomDivider. I think the JIT compiler does a pretty good job of optimizing it for us. Please take a look at this benchmark I just did:

Benchmark                      Mode  Cnt     Score     Error   Units
InjectedWithCustomDivider     thrpt    5  3156,578 ±  23,137  ops/ms
IncludedAsDependency          thrpt    5  3225,171 ±  99,144  ops/ms  +2%

I really appreciate the interest and advice other developers give this library. But I respectfully refuse to add libdivide4j as a dependency.

agentgt commented 2 years ago

I respect that and appreciate it. I wish more libraries did that. I was only pointing it out as you might have thought it was a ton of code but its only one class. If en/decoding was the primary goal of this project I would make a bigger deal about it but its not.

EDIT: To be clear I agree with your decision 👍