alexmadeathing commented 2 years ago

As of release #41 the inner type used to store the dilated form of an integer is a different type to that of the undilated value passed in. It needs to be large enough to house the dilated form of any value. For example, DilatedInt::<u8, 2> actually uses a u16 behind the scenes (2 * 8 = 16). This strategy is advantageous because it means there is no maximum value in the source integer; no runtime panics if a value can't be dilated - instead, if an integer can't be dilated because it is too large, it causes a compile time error which can be caught early.

That's all well and good, and I think it is the expected mode of operation for most developers (note that there are other libraries that exist which operate in the same way, eg. morton-encoding).

But what happens when we, for example, 3-dilate a u8? The number of bits used is 24 (3 * 8) and therefore the integer type used to store the dilated value needs to be a u32. This leaves 8 bits of wasted memory - which doesn't sound a lot on its own, but consider that when many dilated ints are stored, 25% of the memory used is wasted. Put another way, users may want to maximise the number of dilated values they can store in a given type, but are unable to because of the current behaviour.

The original operation (pre. 0.4.0) of this library used the same type for storage as the value passed in for dilation. This had the downside that there was a maximum value that could be dilated and going over this maximum value would cause a panic - but it had the upside that the absolute maximum number of bits would be used in the storage type. For example, u32 3-dilated in this way can contain up to 10 3-dilated source bits instead of 8, and therefore utilise 30 out of 32 bits (only 6.25% wasted memory). These numbers differ for different combinations of D and T, but the principle is the same across the board.

It is therefore my proposal that the old behaviour should be available in some form in this library. Either by some trait magic, or by a completely separate FixedDilatedInt struct. I am yet to decide on how best to structure this.

alexmadeathing commented 2 years ago

I've made extremely good progress on this issue. The implementation is done and I just need to write the docs.

Implementation

DilatedInt now takes a generic adapter <A> instead of <T, D>
The adapter describes the type of dilation, including the inner and outer types involved, wrapper methods to forward to the appropriate dilation functions, and some useful constants
At the moment, there are two types of adapter impls:
- Expand<T, D> is an adapter which describes the current dilation implementation - the inner type adapts to the combination of T and D, choosing the smallest type that would contain all bits of the outer type D-dilated. There are naturally a limited number of T and D combinations available as the the largest integer type supported is u128
- Fixed<T, D> is an adapter which describes the pre-0.4.0 dilation implementation - the inner type is the same as the outer type and a limited subset of bits in the outer type can be dilated. This is useful when you want absolute control over the inner type used and want to fit as many dilated bits into the inner type as possible.
In both cases, the number of possible dilated bits is given by Adapter::DILATED_BITS and the maximum dilated value is given by Adapter::DILATED_MAX (formerly the DilatedMask trait which was discarded in favour of constants closely mimicking the rust standard T::BITS and T::MAX).
As a result of this change, I am no longer relying on macros to produce implementations for different types - almost everything except the adapters uses regular hand written trait impls without repetition. The code is easier to read as a result.
We no longer use From. The use cases of From do not accurately convey what is happening when converting to and from dilated integers. Users should now use the explicit regular_uint.dilate_expand::<D>() or regular_uint.dilate_fixed::<D>() and dilated_int.contract() methods. Note that regular_uint.dilate_expand::<D>() returns a DilatedInt<Expand<T, D>>, regular_uint.dilate_fixed() returns a DilatedInt<Fixed<T, D>> and dilated_int.contract() returns a the original type of the uint T.

Breaking changes

DilatedInt now takes an adapter <A> in place of <T, D>. This is to allow for different formats of dilation. The current behaviour may be implemented using DilatedInt<Expand<T, D>>. The old pre-0.4.0 behaviour may be implemented using DilatedInt<Fixed<T, D>>.
You now convert between ints and DilatedInt using the explicit regular_uint.dilate_expand::<D>() or regular_uint.dilate_fixed::<D>() and dilated_int.contract() methods. This is because use cases of the From impls are extremely hard to read at a glance. Explicit methods better show what is happening.

alexmadeathing commented 2 years ago

I'm nearly done with the documentation and will put the finishing touches on it tomorrow.

I think we should be good for another minor version bump after this. This is a very nice upgrade.

alexmadeathing commented 2 years ago

For reference, the worst offender when using the expand method is u16 5-dilated.

Expand::<u16, 5>::DILATED_MAX == 0x00000000000008421084210842108421u128

See all those 0s?! The reason for this is that 16 bits * 5 = 80 bits - only a few bits beyond the size of a u64. As a result of exceeding a u64, Expand will allocate a u128 as the storage type even though hardly any of the extra bits are used. That's space in the u128 you simply can't get access to with Expand.

The Fixed adapter solves this issue by occupying the maximum number of bits possible in the target storage type.

Fixed::<u128, 5>::DILATED_MAX == 0x01084210842108421084210842108421u128 <- Much better!

This is the behaviour that justified this recent bit of work.

alexmadeathing / dilate

Support fixed types (original pre-0.4.0 behaviour) #42

Implementation

Breaking changes