Open GKFX opened 10 months ago
Noting one potential limitation to offset_of with enums: without a way to set the discriminant, it cannot be used to manually initialize an enum.
Blog post explainer of what I mean: https://onevariable.com/blog/pods-from-scratch/
Zulip thread for discussion: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Prior.20discussions.20for.20manually.20building.20an.20enum.3F/near/441992612
offset_of!
macro is a macro at expression location and it computes the usize
offset of, possibly nested, fields in the source type. This report concerns the use of this macro to compute offsets into enum
s meaningfully, namely covering support tracked by the following two issues.
The macro computes field offsets for both struct
s and enum
s, respecting field privacy.
This macro accepts two argument. The first is the source type that
// crate a
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]
struct AStruct {
pub field: u8,
field2: u32,
}
enum AEnum {
Variant1(u8),
Variant2 {
field: u8
},
Variant3(AStruct),
}
struct ANestingStruct(AStruct);
// readings as of Rust 1.78.0
// DO NOT EXPECT THESE ASSERTIONS TO WORK ACROSS VERSIONS
fn main() {
assert_eq!(offset_of!(AStruct, field), 4);
assert_eq!(offset_of!(AStruct, field2), 0);
assert_eq!(offset_of!(AEnum, Variant2.field), 1);
assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
}
By respecting privacy, suppose that in another crate b
// crate b
#![feature(offset_of_enum)]
#![feature(offset_of_nested)]
fn main() {
// these still work
assert_eq!(offset_of!(AStruct, field), 4);
assert_eq!(offset_of!(AEnum, Variant2.field), 1);
assert_eq!(offset_of!(AEnum, Variant3.0.field), 8);
// this does not compile because field2 is private here
// assert_eq!(offset_of!(AStruct, field2), 0);
// this does not compile because 0 is private here
// assert_eq!(offset_of!(ANestingStruct, 0.field), 4);
}
The rustdoc of our core
library has been updated with code example enabled by these two feature gates.
The basic test cases for enum
s and nesting structures is covered by the following tests, including edge cases for parsing rule involving the field specifiction part of offset_of!
macro calls.
enum
senum
sThere has been discussion around in-place construction of enum
values and the possibility to know the location of discriminant with offset_of!
and manipulate discriminant value. This is still under inception phase and will possibly be supported by a separate utility or interface for safety and better UX.
I discovered that in feature(generic_const_exprs)
the support for THIR + MIR OffsetOf
is sort of half-baked.
Given that generic_const_exprs
is incomplete, the only remaining concern is to use it meaningfully at anonymous constant context, or otherwise it should not block stabilization of it for use in const fn
which is a viable substitute for a number of scenarios that I can think of. In any case, I am taking a small initiative to tackle it. :crossed_fingers:
@rfcbot fcp merge
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. See this document for info about what commands tagged team members can give me.
There has been discussion around in-place construction of
enum
values and the possibility to know the location of discriminant withoffset_of!
and manipulate discriminant value. This is still under inception phase and will possibly be supported by a separate utility or interface for safety and better UX.
I don't think this is a blocker: the discriminant isn't encoded as a real field with an offset, so it can't be represented with offset_of
anyways.
@rfcbot concern consistent-future-syntax
Given that this item (unlike nested fields) isn't as urgent, and that we don't have any syntax for this elsewhere in the language (unlike nested fields), I'd like to make sure that either:
1) The syntax we choose here is consistent with any future syntax we might want to use elsewhere in the language, or 2) We don't actually want to provide any syntax for this elsewhere in the language, so we're fine with inventing a syntax for this here.
(We currently have pattern syntax for extracting fields, but the syntax proposed for offset_of
here is not pattern syntax. That doesn't necessarily mean it should be, but that means we're introducing a new syntax here and we should decide if that's the syntax we want.)
elsewhere in the language
The same syntax could be used for the proposed field_of! macro, RFC 3318. It could possibly be used for the proposed discriminant_of!
, although this seems less necessary as supporting discriminant_of!(Enum, Variant)
alone would likely be sufficient. A type_of!(Type, fields)
has also been suggested.
Outside of reflection-related macros, the syntax o.Some.0
(for an Option o
) implies to me accessing the Some
variant of an enum without doing pattern matching. This would mostly likely be an unsafe operation (unless the user had somehow previously proved to the compiler that o
is a Some
). Unsafely accessing an enum variant can already be done with let Some(o) = o else { unreachable_unchecked() };
, and is unusual enough that I doubt it will ever need a dedicated syntax. Alternatively, o.Some.0
could be understood to be in some way fallible, like a more concise alternative to pattern-matching, but again that seems unlikely to actually happen.
Marking lang nominated to discuss how we want the future landscape to look here, and how we want to provide this functionality. (There's enough of a use case for this; the question is syntax.)
Hey @joshtriplett, is this scheduled for a meeting agenda? If possible, I'd love to observe that meeting! I'm still very interested in the ability to construct enums in-place for deserialization reasons
It'll be in an upcoming meeting, but we don't have it scheduled for a particular date. cc @traviscross who would have a better idea of timing.
With feature(offset_of_enum)
alone it's possible to construct enums in-place with atleast opt-level=1
(for dead store elimination).
At opt-level=0
there are extra copies to the stack & back but that's pretty normal for opt-level=0
.
See: https://rust.godbolt.org/z/jz8qc1Pqv
Basically, we can emulate SetDiscriminant
although with slightly different semantics.
In-place construction is done by initializing fields in-place using offset_of
just like you would a struct then using something like the following:
/// initializes `dst` to be valid AnEnum::X containing the fields already initialized in `dst`
/// # Safety
/// - `dst` must be properly aligned pointing to an `AnEnum`
/// - `dst` must be valid for writes
/// - The value pointed at by `dst` must have all fields of variant `X` initialized and valid.
unsafe fn set_discriminant_x(dst: NonNull<()>) {
// All the reads that immediately write the same value to the same location
// get eliminated with alteast `opt-level=1` leaving only the discriminant write, if any.
let value = AnEnum::X {
a: dst.byte_add(offset_of!(AnEnum, X.a)).cast::<u32>().read(),
b: dst.byte_add(offset_of!(AnEnum, X.b)).cast::<u64>().read(),
c: dst.byte_add(offset_of!(AnEnum, X.c)).cast::<Box<str>>().read(),
};
dst.cast::<AnEnum>().write(value);
}
enum AnEnum {
X{ a: u32, b: u64, c: Box<str> },
Y([u8; 12]),
}
Note: Ideally a derive macro is generating the above function.
By using a function pointer you can set the discriminant in a type erased fashion as well. Also, even with many fields (20) the optimization consistently occurs in my tests so far. I believe this should resolve the limitation @jamesmunns faced, I think.
Of course, it would still be great to have dedicated support for manipulating discriminant.
However, I agree that the lack of such support should not block feature(offset_of_enum)
.
Full Zulip thread about the workaround: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/.E2.9C.94.20Workarounds.20for.20setting.20enum.20discriminants/near/470331454
My take:
I don't know any other place in the language where we might want to use this kind of syntax, though I can imagine it (e.g., MIR has similar paths of this kind). If we want certainty, though, the only syntax that we can be SURE will not conflict with anything else would be to use pattern matching.
Have folks explored pattern matching and what that might look like? I was imagining something like "supply a pattern with exactly one binding" (e.g., offset_of!(path @ Some(x))
).
(Personally though I feel reasonably good about the proposed Variant.field
syntax, and we could of course deprecate it in the future if it proves to be a mistake...)
Just confirming what @exrok said - I don't think "set discriminant" functionality should block this FCP, I'm working on a follow-up RFC for that, and that RFC would build on top of this one.
Edit: I've opened a PR for this RFC: https://github.com/rust-lang/rfcs/pull/3727
Feature gate:
#![feature(offset_of_enum)]
This is a tracking issue for using enum variants in offset_of. Enum variants themselves do not have an offset within their type, so the macro will not give an offset for them, but their fields do. For example, the standard library uses the offset of the content of the
Some
variant ofOption
to implementOption::as_slice
. The original RFC for offset_of was https://github.com/rust-lang/rfcs/pull/3308.Public API
Steps / History
Unresolved Questions