reinerp / varlen-rs

Ergonomic variable-length types in Rust
BSD 3-Clause "New" or "Revised" License
7 stars 1 forks source link

dynamic alignment support #2

Open ericmarkmartin opened 1 year ago

ericmarkmartin commented 1 year ago

Hello.

I am trying to implement an objective-c runtime in rust. I'm currently working on building dispatch tables. The runtime's api allows dynamic creation of classes (and thus the dynamic addition of instance variables and methods to the class), causing the dispatch tables to have variable length, hence my interest in this crate.

One issue, however, is that in addition to the dispatch tables having variable length, they can also have variable alignment, depending on what types of instance variables have been added to the class the object is an instance of. Unfortunately, the VarLen trait requires a constant alignment.

Is there a way around this for me?

reinerp commented 1 year ago

I designed in constant alignment to make the runtime offset calculations cheaper. In theory we could extend this with support for dynamic alignment, together with an option to say "this type's alignment is actually known statically; no need to calculate it dynamically". I haven't thought through all the implications of this.

Can you pad alignment up to a predefined max alignment? E.g. universally forcing alignment of 16 bytes is fairly common in some systems.

ericmarkmartin commented 1 year ago

Thanks for getting back to me so quickly!

I designed in constant alignment to make the runtime offset calculations cheaper. In theory we could extend this with support for dynamic alignment, together with an option to say "this type's alignment is actually known statically; no need to calculate it dynamically". I haven't thought through all the implications of this.

That makes sense---how difficult do you think adding this dynamic alignment would be?

Can you pad alignment up to a predefined max alignment? E.g. universally forcing alignment of 16 bytes is fairly common in some systems.

This would probably work for most objective-c programs, but the runtime spec allows consumers to specify arbitrary (power of 2) alignments when adding instance variables to classes.

reinerp commented 1 year ago

Thanks for getting back to me so quickly!

I designed in constant alignment to make the runtime offset calculations cheaper. In theory we could extend this with support for dynamic alignment, together with an option to say "this type's alignment is actually known statically; no need to calculate it dynamically". I haven't thought through all the implications of this.

That makes sense---how difficult do you think adding this dynamic alignment would be?

In principle it's doable. We'd remove const ALIGN: usize; from the VarLen trait, and instead add fn align(&self) -> usize; to the Layout trait. We'd make the other fairly mechanical changes that follow from that.

That gets us to a point where the trait system at least allows you to do what you'd like. However, the types produced by the #[define_varlen] procmacro would still have constant alignments, because variable-length arrays inherently have constant alignments, e.g. in the following we can directly read off that Foo has alignment of 2 bytes.

#[define_varlen]
struct Foo {
    #[controls_layout]
    len: u8,

    #[varlen]
    arr: [u16; *len as usize],
}

In fact, I can't immediately see how to map your usecase to the #[define_varlen] macro, even with potential extensions to that macro. I'm probably just lacking imagination. Can you give an example of the #[define_varlen] struct definition you'd like to use (imagining hypothetical extensions of the procmacro as you'd want), and then we can figure out how to make that work?

ericmarkmartin commented 1 year ago

In fact, I can't immediately see how to map your usecase to the #[define_varlen] macro, even with potential extensions to that macro. I'm probably just lacking imagination. Can you give an example of the #[define_varlen] struct definition you'd like to use (imagining hypothetical extensions of the procmacro as you'd want), and then we can figure out how to make that work?

I'm actually not using the #[define_varlen] macro for the type in question. I have a generic struct

#[repr(C)]
pub struct Repr<T> {
    /// Pointer to this object's class.
    is_a: Receiver,
    data: T,
}

This is supposed to help me model the objective-c id type, which can point to an object or a class.

Then I fill the T with either ClassData which is a normal fixed length struct or ObjectData, which is defined like

#[define_varlen]
pub struct ObjectData {
    #[controls_layout]
    pub(crate) ivar_bytes: usize,

    #[varlen_array]
    pub(crate) ivars: [u8; *ivar_bytes],
}

.

I'm simplifying by assuming the dispatch table is just instance variables and not methods or the like. The when we create an instance of the class, we have access to a std::alloc::Layout for the ivars, which is where the dynamic alignment comes from. That means, however, that I'm having trouble implementing VarLen for Repr. I never even attempted to use #[define_varlen] for Repr b/c it seemed struggle w/ the generics.

reinerp commented 1 year ago

Thanks! So if I'm understanding this correctly, if you wanted dynamic alignment support, you'd probably do something like this:

#[define_varlen]
pub struct ObjectData {
    #[controls_layout]
    pub(crate) ivar_bytes: usize,
    #[controls_layout]
    pub(crate) ivar_align: usize,

    #[varlen_array]
    #[varlen_align = *ivar_align]    /// Hypothetical new construct to force dynamic alignment of a field
    pub(crate) ivars: [u8; *ivar_bytes],
}

Am I getting that right?

(The generics issue with #[define_varlen] for Repr is probably fixable. I might have just not implemented that.)

ericmarkmartin commented 1 year ago

Thanks! So if I'm understanding this correctly, if you wanted dynamic alignment support, you'd probably do something like this:

Yeah that's exactly right! It'd be extra nice if I could just stick a std::alloc::Layout there instead of bytes and align, but not a huge deal.

Regarding #[define_varlen] for generics, that'd also be super nice; I think I could implement VarLen for it myself but I'm always happy to have less unsafe code on my end.

ericmarkmartin commented 1 year ago

Is there anything I can do to help realize this in varlen? I'm not super comfortable developing macros but willing to learn.

reinerp commented 1 year ago

Thanks for the offer!

I'll give some advice at the end of this comment for how you can try out what I described above. However, I should say that I still have some reservations about whether this alignment support is a good fit for (a) your problem and (b) the varlen library.

Regarding (b), my reservation is that varlen is aimed at the use case of storing variable-length arrays inline in objects in support of better (faster and smaller) memory layouts than Vec<> would offer. The alignment support currently built into varlen is sufficient for that, because we're using arrays of known types. In contrast, alignment support only seems to show up when we have unknown (dynamic) typing, as in your scenario. I suspect that when doing dynamic typing of this form, there are many additional concerns that the varlen library doesn't handle: for example, safely transmuting from the (static) child type to the (dynamic) parent type; or doing method dispatch for known methods of the child type; etc. I didn't intend for varlen to solve all of those concerns, and I suspect that just solving this alignment issue for that use case doesn't provide enough of a help to be worth it in that case.

Regarding (a), from the description you've given I suspect you must already need a substantial amount of code to (1) calculate the layout for instance variables and (2) read/write instance variables and bytecast/transmute between u8 arrays and the true types of your instance variables. I don't think varlen will provide assistance with those concerns, and I suspect you will need to end up writing unsafe code for those activities. My suspicion is that, once you've bitten off (1) and (2) and the unsafe code that they imply, the amount of code and unsafety savings you'd get from using varlen is very minimal.

Perhaps I'm wrong in understanding what value varlen would provide in your situation. I think one way to quickly prototype varlen in your code would be to use pub(crate) ivars: [u64; *ivars_len], i.e. u64 array instead of u8 array. This forces the alignment to be 8 bytes, which I suspect is large enough for most use cases, even if theoretically there are more demanding use cases. On addIvars calls with larger alignments you could panic!() (for now) or store them behind another level of pointers (inside an AlignedBox) or similar. This would be one way to get some quick experience on how much value varlen provides in your use case. It would also be pretty helpful to me in understanding the overall picture, with respect to my hesitations listed above.

Guidance on trying this out in varlen

In spite of the above, I'll list what it would take to try out this dynamic alignment support in varlen.

The first part is this, which I mentioned above:

We'd remove const ALIGN: usize; from the VarLen trait, and instead add fn align(&self) -> usize; to the Layout trait. We'd make the other fairly mechanical changes that follow from that.

Given your example and after thinking through it more, I'm supportive of this change. If you follow the example of fn size(&self) -> usize; and see all the places that is used, I think you should be able to do roughly the same for the ALIGN field, and you should hopefully be able to make this change.

The second part is the implementation of the #[varlen_align = ...] field. This part is probably more involved: we'd need to extend the procmacro to parse that field and insert the appropriate code into the layout calculations. The best example to follow would be how the #[varlen_array] is parsed and how the [T; LEN] part of the type is parsed. The parsing will be different in this case (because you're not parsing a type, you're parsing an attribute), and you can use syn::Attribute to parse that.