orlp / slotmap

Slotmap data structure for Rust
zlib License
1.13k stars 70 forks source link

Allow customization of key sizes #52

Open orlp opened 3 years ago

orlp commented 3 years ago

An investigation is necessary to see if we can safely allow customization of key sizes. Something along the following lines should work:

pub trait Key {
    type Index: KeyIndex;
    type Version: KeyVersion;
}

pub trait KeyIndex: private::Sealed {
    #[doc(hidden)]
    type T: UInt;
}

pub trait KeyVersion: private::Sealed {
    #[doc(hidden)]
    type T: UInt;
    #[doc(hidden)]
    type NonZeroT: NonZeroUInt;
}

// All the ops needed for implementing.
#[doc(hidden)]
pub trait NonZeroUInt: Copy + PartialEq + Eq /* ... */ {}
#[doc(hidden)]
pub trait UInt: Copy + PartialEq + Eq /* ... */  {}

mod private {
    pub trait Sealed {}
}

// Implement all for u8, u16, u32, u64, u128...
impl UInt for u32 {}
impl NonZeroUInt for NonZeroU32 {}
impl KeyIndex for u32 {
    type T = u32;
}
impl KeyVersion for u32 {
    type T = u32;
    type NonZeroT = NonZeroU32;
}

impl private::Sealed for u32 {}

pub struct KeyData<I: KeyIndex, V: KeyVersion> {
    idx: I::T,
    version: V::NonZeroT,
}

// Allow following syntax:
new_key_type! {
    struct DefaultKey;             // u32 index/version
    struct SmallKey(u16, u16);     // u16 index/version
    struct HardenedKey(u32, u128); // u32 index, u128 version
}
orlp commented 3 years ago

Eureka! If we implement this we can also add a marker indicating whether a version is allowed to wraparound or not.

For security sensitive application where under absolutely no circumstances a key might alias, ever, regardless of access pattern instead of allowing a slot version to wrap around we can simply retire a slot after its version field has reached the maximum value.

In fact, I will probably change the default to this behavior, and only allow wraparound as an opt-in for extremely memory leak sensitive applications.

Robbepop commented 3 years ago

Would this allow use cases that never need to remove keys?

Currently SlotMap adds overhead to use cases where keys never need to be removed after first added. Maybe this issue describes some way towards this feature where SlotMap methods for key removal would also be missing so that no protection through generational indices is required anymore.

orlp commented 3 years ago

@Robbepop I don't think that's in scope for slotmap. The whole point of slotmap is that it provides a data structure that allows for safe insertion and deletion, without deletion you could just use a Vec and indices.

Calandiel commented 2 years ago

Hmmm, wouldn't there still be a benefit of having type safe Keys instead of indices into a vector?

nicoburns commented 1 year ago

@orlp I don't suppose there's any chance that you will time to get around to this any time soon? (or if not, would you be open to accepting PRs making such a change?)

We are using slotmap the backing storage in taffy, and smaller slotmap keys would be useful to to us because we want to add a Calc variant (containing a large complex heap-allocated structure) to our Dimension enum. However, that enum is currently only 32 bits (excluding enum tag), and it is used (and persisently stored in memory) a lot in our code doubling it to 64 bits would constitute a large increase increase in library's memory usage, which would be unfortunate for a relatively rarely used feature.

Ideally we would like something like a 24 bit index with 8 bit version, although 16 + 16 could probably also be made to work if necessary.

orlp commented 1 year ago

@nicoburns Well, I certainly plan on getting around to this Soon:tm:, but my Soon:tm: might not match your soon, and usually it doesn't even match my own idea of soon :(

I would love to accept PRs if not for the fact that this is quite core to the crate and I want it done Right:tm:, as I will probably only get one, maybe two shots at releasing a version of slotmap with core breaking changes. I'm not saying that you wouldn't do it Right:tm:, but my Right:tm: might not be your "right", and if I don't take it slow and really think things through, it won't be my right either.

There's a bit of a dichotomy in that a lot of people agree that Agile is the way to go, with many small releases, but at the same time everyone expects their libraries to not have breaking changes. But the only way I can do that is if I really get it right the first, maybe second time.

I have a big upcoming project release in the coming three weeks, so it's unlikely I'll get some work on this before then, but after that I'll be able to dedicate a weekend here and there to it.

Ideally we would like something like a 24 bit index with 8 bit version

I would love to be able to support that too, I should look around for a decent U24 type. Alignment tends to make such a thing a pain though.