sirwhinesalot / table-enum

A convenient rust macro to create enums with associated constant data (note: this is different from normal rust enums which are really tagged unions)
MIT License
1 stars 1 forks source link

LUT support #1

Open Caellian opened 1 year ago

Caellian commented 1 year ago

Instead of generating large switch statements I suggest this macro adds a lut attribute for fields which allows generating lookup tables for O(1) access at runtime:

table_enum! {
    enum BinaryOp(#[lut] value: u32, current: NonConstInitiableType) {
        Add(6, NonConstInitiableType::construct("a")),
        Sub(2, NonConstInitiableType::construct("b")),
        Mul(7, NonConstInitiableType::construct("c")),
        Div(4, NonConstInitiableType::construct("d")),
        Pow(4, NonConstInitiableType::construct("e")),
        ...
    }
}

which would then generate code that looks like:

#[repr(usize)] // default is usize if other repr not specified
enum BinaryOp {
        Add,
        Sub,
        Mul,
        Div,
        Pow,
}

impl BinaryOp {
  const ValueLUT: [MaybeUninit<u32>; 5] = unsafe {
    let mut lut: [MaybeUninit<u32>; 5] = MaybeUninit::uninit().assume_init();
    lut[BinaryOp::Add as usize] = MaybeUninit::new(6);
    //...
    lut[BinaryOp::Pow as usize] = MaybeUninit::new(4);
    lut
  };

  pub const fn value(&self) -> u32 {
    unsafe {
      //SAFETY: LUT is fully populated, so any *self offset will exist
      BinaryOp::ValueLUT.as_ptr().add(*self as usize).read().assume_init()
    }
  }

  // pub const fn current is handled normally
}

I currently have a use case with representing JVM bytecode instructions like that and I wrote a small macro for that, but it's basically doing what I described.

Note that this example doesn't require MaybeUninit but types that don't implement Copy will not work with something as simple as:

let mut lut = [0; 5];
sirwhinesalot commented 1 year ago

Hi Caellian, wow I wasn't expecting anyone to actually have a look at table_enum, I wrote it just to learn how to work with rust proc macros and how to publish on crates.io haha.

I'm open to switching to a LUT-based implementation but won't the Rust compiler already turn the switch statements into one?

Caellian commented 1 year ago

It will evaluate const fns at runtime only if it knows their values.

So something like:

BinaryOp::Add.value()

will be evaluated at compile time and place the value in generated binary.

While something like:

fn do_something(with_op: BinaryOp) -> u32 {
  with_op.value()
}

will end up looking like: 2023-11-12-151232_hyprshot Godbolt link

While the LUT approach produces: 2023-11-12-152624_hyprshot Note that most of the assembly generated is array access check (jump instructions).

With something more evil like:

pub const fn what_is(&self) -> &'static str {
    unsafe {
        Example::LUT.as_ptr().add(*self as usize).read().assume_init()
    }
}

the grayed out function assembly boils down to only:

movzx   eax, byte ptr [rdi]
shl     rax, 4
lea     rdi, [rip + .L__unnamed_4]
add     rdi, rax
mov     rax, qword ptr [rip + core::ptr::const_ptr::<impl *const T>::read@GOTPCREL]
call    rax
pop   rcx
ret

and that's safe, because we know that generated LUT will have an entry at starting address + *self, and we can read that without LUT[offset] generating unreachable panic code. (Godbolt link)

sirwhinesalot commented 1 year ago

You've sold me on it. I won't have time to touch this again for awhile, but you could have a look at const-table instead. Unfortunately they use an array of structs instead of a struct of arrays approach.

table-enum 2.0 (if I ever get to it) will switch to a const-table style syntax for the enum declaration, but with SoA layout as the current table-enum 1.0 plus the new #[option] and #[default] attributes.