Tracking Issue for `#![feature(offset_of)]`

WaffleLapkin commented 1 year ago

Feature gates tracked by this issue:

#![feature(offset_of)]
~~#![feature(offset_of_enum)]~~ -> moved to #120141

This is a tracking issue for the offset_of! macro which evaluates to a constant containing the offset in bytes of a field inside some type (https://github.com/rust-lang/rfcs/pull/3308).

Public API

// core::mem

pub macro offset_of($Container:ty, $field:tt $(,)?) {
    // ...implementation defined...
}

Steps / History

[x] Implementation: https://github.com/rust-lang/rust/pull/106934
[x] Split off offset_of_enum: #117537
[x] Final comment period (FCP)^1: old, new
[x] Stabilization PR: https://github.com/rust-lang/rust/pull/118799

Possible future extensions / work

[ ] Support for static alignment DSTs (struct Example(u32, [u8]))
[x] Support for enums: #114208
[ ] Improving diagnostics around the macro errors

Unresolved Questions

None yet.

Amanieu commented 12 months ago

I make use of offset_of! with nested fields and find the syntax very convenient. I previously used multiple offset_of! but the resulting code was often hard to read.
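For readers unfamiliar with the workaround mentioned here, the following is a minimal sketch of computing a nested offset from two single-field offset_of! calls. The types `Inner` and `Outer` and the constant name are hypothetical, and the expected value assumes a typical target where u32 is 4-byte aligned.

```rust
use std::mem::offset_of;

// Hypothetical types, used only for illustration.
#[allow(dead_code)]
#[repr(C)]
struct Inner {
    tag: u8,
    value: u32,
}

#[allow(dead_code)]
#[repr(C)]
struct Outer {
    header: u16,
    inner: Inner,
}

/// Offset of `outer.inner.value` from the start of `Outer`, built by adding
/// two single-field `offset_of!` results instead of writing `inner.value`.
const INNER_VALUE_OFFSET: usize = offset_of!(Outer, inner) + offset_of!(Inner, value);

// With `repr(C)` and a 4-byte-aligned `u32`: `inner` starts at byte 4 of
// `Outer`, and `value` starts at byte 4 of `Inner`.
const _: () = assert!(INNER_VALUE_OFFSET == 8);
```

The sum gets longer with every extra level of nesting, which is the readability cost described above.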

tgross35 commented 12 months ago

It just seems we aren't close to consensus on nested syntax due to various concerns:

Matching in user macros, above
User friendliness / discoverability
Consistency with whatever happens for enums (#114208, now gated by https://github.com/rust-lang/rust/pull/117537) and arrays
Nested tuple field paths are lexed like float literals (e.g. 1.2)
Wanting this sort of meta field reference to work outside of offset_of whenever we use it in the future (like C++ member pointers, swift field paths, reflection)
Other things discussed on zulip https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/.60offset_of!.60.20Syntax
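The float-literal concern in the list above can be seen with a small token-counting macro. This is only a sketch: `count_tts` is a hypothetical helper, not part of any library, and it merely shows how macro_rules sees the input.

```rust
/// Hypothetical helper: counts the token trees passed to the macro.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// A named path `a.b` is three tokens: `a`, `.`, `b`.
const NAMED: usize = count_tts!(a.b);
// `1.2` is lexed as a single float literal, not as `1`, `.`, `2`.
const TUPLE: usize = count_tts!(1.2);
// `t.0.1` is `t`, `.`, and the float literal `0.1`: three tokens, not five.
const MIXED: usize = count_tts!(t.0.1);

const _: () = assert!(NAMED == 3);
const _: () = assert!(TUPLE == 1);
const _: () = assert!(MIXED == 3);
```

A matcher of the form `$($fields:tt).+` therefore never sees the dot between two consecutive tuple indices, which is why that case needs special handling.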

This is likely at least a few months of design work if not more, even if the bare minimum is somebody writing a justification proposal/examples defending the current syntax. We definitely want it still, but it doesn't seem worth indefinitely blocking the core functionality of offset_of on this when nested field access was only a future possibility in the RFC.

cc implementer @DrMeepster for any thoughts

beepster4096 commented 12 months ago

I agree with @tgross35 that we shouldn't block stabilization on nested field accesses. While it's nice to have, it's not a critical feature. We can let it bikeshed a little while longer.

est31 commented 12 months ago

Regarding @Amanieu's remarks, they are useful feedback but note that I think nobody is arguing against nested uses at all: I think everyone here wants it. The question is more about the timing: it would be really great if we could stabilize some form of offset_of soon. The demand in the ecosystem is quite high from what I can tell.

I think most people have spoken out in favour of the MVP approach: maybe it is time for someone (@GKFX? @joshlf?) to make a PR to split off nested fields into a separate feature gate (#![feature(offset_of_nested)]?) and stabilize the single-field offset_of feature. Of course it needs a libs-api FCP, too (or maybe the current one could be reused?).

GKFX commented 12 months ago

I have a branch locally for stabilizing offset_of! taking only a single ident - I can try to get that finished.

Amanieu commented 11 months ago

We discussed this in the libs-api meeting today. We're happy to limit the stabilization to single fields for now while the discussion about nested fields and enum variants is ongoing. cc @BurntSushi

BurntSushi commented 11 months ago

@est31 Do you have ideas on how to move forward here with a partial stabilization?

scottmcm commented 11 months ago

👍 to stabilizing with just single fields for now. I agree with https://github.com/rust-lang/rust/issues/106655#issuecomment-1815867295 that we want multiple field support in this eventually, but that it's not worth blocking the basic functionality on figuring out how best to phrase the convenience support.

est31 commented 11 months ago

That is wonderful news @Amanieu! I think the three things missing are:

PR to split the feature off -- @GKFX has indicated they can work on this
stabilization PR
stabilization report: done
stabilization FCP

If we want to get this into 1.76.0, the most time-intensive item is the stabilization FCP, which needs 10 days. Either we can re-use the FCP in this issue, or we can issue a new one. I would prefer the latter, as it is cleaner. Maybe rfcbot cancel followed by rfcbot merge are the right commands? I will write a stabilization report shortly.

est31 commented 11 months ago

Feature Summary

This proposes stabilization of the offset_of macro restricted to non-nested uses:

#[repr(C)] // the offsets asserted below are only guaranteed with an explicit repr
struct FieldStruct {
    first: u8,
    second: u16,
    third: u8
}

assert_eq!(std::mem::offset_of!(FieldStruct, first), 0);
assert_eq!(std::mem::offset_of!(FieldStruct, second), 2);
assert_eq!(std::mem::offset_of!(FieldStruct, third), 4);

Nested uses will for now still require a feature gate:

#![feature(offset_of_nested)] // precise feature name still TBD

#[repr(C)]
struct NestedA {
    b: NestedB
}

#[repr(C)]
struct NestedB(u8);

assert_eq!(std::mem::offset_of!(NestedA, b.0), 0);

Regarding the stability of the output, we have this section in the rustdoc level docs:

Note that type layout is, in general, subject to change and platform-specific. If layout stability is required, consider using an explicit repr attribute.

Documentation

Like most library features, it has rustdoc-level documentation.

Tests

The feature is well tested, both in the testsuite, and in the ecosystem.

Unresolved questions

The syntax for nesting and enums is still under debate. Enums already live under their own feature gate since #117537, and nesting will be split off prior to or inside the stabilization PR.

joshlf commented 11 months ago

Thanks for writing this up!

Except for #[repr(C)] types, the output is documented to be not stable.

This isn't sufficient - in some cases, other reprs such as transparent or packed are sufficient to fully guarantee a type's field offsets.

We should probably also clarify: "...not to be stable across multiple compilations of the same program." Within a given compilation, the output is stable even without a repr.
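A minimal sketch of the distinction, using hypothetical types: with repr(C), repr(C, packed) or repr(transparent) the offsets below follow from the layout rules, while for the default repr only a bounds relationship is checked. The `align_of` comparison is used instead of a literal so that the assertion does not assume a particular target.

```rust
use std::mem::{align_of, offset_of, size_of};

// Hypothetical types, used only for illustration.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
}

#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    a: u8,
    b: u32,
}

#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper(u32);

#[allow(dead_code)]
struct DefaultRepr {
    a: u8,
    b: u32,
}

// repr(C): `b` follows `a`, padded up to the alignment of `u32`.
const _: () = assert!(offset_of!(CLayout, b) == align_of::<u32>());
// repr(C, packed): no padding, so `b` starts right after the one-byte `a`.
const _: () = assert!(offset_of!(Packed, b) == 1);
// repr(transparent): the wrapper has the layout of its single field.
const _: () = assert!(offset_of!(Wrapper, 0) == 0);
// Default repr: field order is unspecified, so only check that `b` fits.
const _: () = assert!(offset_of!(DefaultRepr, b) + size_of::<u32>() <= size_of::<DefaultRepr>());
```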

GKFX commented 11 months ago

We should probably also clarify: "...not to be stable across multiple compilations of the same program." Within a given compilation, the output is stable even without a repr.

This has been re-worded a couple of times in the library documentation and I believe the documentation there is now adequate; it includes or links out to the information you mention.

Would it be OK to stabilize this in the form below?

#[stable]
pub macro offset_of {
    ($Container:ty, $field:ident $(,)?) => /* stable */,
    ($Container:ty, $($fields:tt).+ $(,)?) => /* unstable */,
}

I would have some concern about making the $($fields:tt).+ form usable on stable since it looks like it will need to change and I don't want there to be unintended breakage when the macro is called from other macros etc. and the parameter types then change.

the8472 commented 11 months ago

Ident wouldn't support the tuple field accesses mentioned in https://github.com/rust-lang/rust/issues/106655#issuecomment-1793844266

est31 commented 11 months ago

I don't want there to be unintended breakage when the macro is called from other macros etc. and the parameter types then change.

While yes, there is such a rule, it apparently doesn't apply for ident becoming a tt. This compiles:

// This would be the external macro calling offset_of
macro_rules! foo {
    ($l:ident) => { bar!($l); }
}

// This would be the offset_of macro in core/std
macro_rules! bar {
    ($c:tt) => {}
}

foo!(hi);

Good point though, I wasn't aware of this rule. This is the reference section.

My suggestion for the split-off would be to do the check after the macro, inside the builtin syntax: see whether fields has more than one item. If it does, and the appropriate feature gate is not enabled, emit an error.
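As a rough model of that check (not rustc's actual code; the function name, error type and parameters are all hypothetical), the gating logic could look something like this:

```rust
/// Hypothetical error type for the toy model below.
#[derive(Debug, PartialEq)]
enum GateError {
    /// More than one field was given without the nested feature gate.
    NestedFieldsUnstable { depth: usize },
    /// No field was given at all.
    EmptyPath,
}

/// Toy model of the proposed check: a single field is always accepted,
/// a longer path only when the (hypothetical) nested gate is enabled.
fn check_field_path(fields: &[&str], nested_gate_enabled: bool) -> Result<(), GateError> {
    match fields.len() {
        0 => Err(GateError::EmptyPath),
        1 => Ok(()),
        _ if nested_gate_enabled => Ok(()),
        depth => Err(GateError::NestedFieldsUnstable { depth }),
    }
}
```

Because the check happens after the macro has matched its input, the macro's matcher itself would not need to change when nested paths are stabilized later.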

est31 commented 11 months ago

This has been re-worded a couple of times in the library documentation and I believe the documentation there is now adequate; it includes or links out to the information you mention.

Thanks for pointing that out, I have updated the stabilization comment.

RalfJung commented 11 months ago

If we want to get this into 1.76.0,

We do time-based releases, not feature-based releases, precisely to avoid any kind of "deadline rush". Please don't rush this. It's no problem at all if it takes 6 weeks longer to ship this, so I'd rather we take the time to do this properly. :)

E.g. we need to be reasonably confident that whatever syntax we ship is forward-compatible for extending this to multiple fields later.

est31 commented 11 months ago

At least I am quite confident that the existing syntax is forward-compatible for multiple fields, whether . or , or /, or ->, or :: is used to separate fields, or just spaces. I have pointed to the place above where I think it's easiest to lock things down. I am familiar with the parsing code as I have implemented it, and I think it's extremely easy to restrict it to single fields.

It is not my intent to rush the feature, and please don't misunderstand me that it is an absolute must that it has to be released at that date: it isn't.

But in my experience, light nudges towards a target date help get a feature over the finish line, even if it misses a release in the end. They put people into the right mindset of looking for issues and deciding whether those are big enough to block stabilization.

Sure it gives more scrutiny if you have six weeks of a feature sitting on nightly as "to be stabilized soon" until it actually goes to beta, but that is actually quite close to a feature being stabilized shortly before a release, it's just a few days offset :).

When humans collaborate, there is a need for coordination, which also involves deciding when to work on something. Mere waiting time is not helpful on its own, as a feature's issues and their solutions are only discovered through attention. Sometimes that attention only appears once a feature is on stable, which is less ideal than attention paid while it is still on nightly. I'd argue that for offset_of, we have had plenty of time at this point. People brought up stabilization weeks after the PR merged: the feature is quite wanted.

It often happens that features get stuck in a "final touches missing" hell, and can stay there for months or even years. Not because the final touches are insurmountable, but because they need people's attention and nobody is confident enough to step up and say "let's stabilize this soon". I am only pushing towards stabilization now because of the questions about the remaining blockers earlier in the thread, and because it does feel like the feature is ready.

I think people already apply a fair amount of scrutiny for unstable features, and this time around I put extra emphasis on tests and I have asked for testing inside the ecosystem for offset_of, and people did report their experiences with it in addition to some of my own experiments.

In this instance, I don't believe there are any stabilization blockers left (outside of the split of the feature into two). This doesn't mean that this is an unchangeable fact; there may well be concerns. So if anyone has concrete issues, please bring them up. If a sufficiently grave concern comes up, it is absolutely important to delay the stabilization until it is resolved or worked around (e.g. by limiting the stable parts of the feature).

RalfJung commented 11 months ago

At least I am quite confident that the existing syntax is forward-compatible for multiple fields, whether . or , or /, or ->, or :: is used to separate fields, or just spaces. I have pointed to the place above where I think it's easiest to lock things down. I am familiar with the parsing code as I have implemented it, and I think it's extremely easy to restrict it to single fields.

Okay, so it is very unlikely that we'll end up with offset_of!(Type.field.field) or so, something other than a , between giving the type and giving the field inside the type?

CryZe commented 11 months ago

I'm sorry if I've missed it (a Ctrl+F for "pattern" didn't turn up anything), but I'm really surprised that this issue isn't just using the basic pattern matching syntax. That would cover all cases, including deeply nested fields and enums, without needing to make up any new way of expressing these things. Also, in my opinion, if we were to go with pattern matching syntax, it wouldn't be very compatible with the offset_of!(Type, field) form that is intended to be stabilized. At the very least there would then be two ways of expressing the same thing, with the , syntax possibly being deprecated right away.

est31 commented 11 months ago

The current-on-nightly syntax has precedent in C's offsetof (standard link), including the way dots work for nesting, so it has some familiarity I would presume.

But this doesn't mean that we can't do offset_of!(Type.field.field) if we really wanted to.

Regarding pattern syntax, I wonder how arrays would be represented, so what would be the pattern analog of C's offsetof(S, field.field[20].10)?

est31 commented 11 months ago

IDK maybe I made the wrong estimate and it's too early still for a stabilization of offset_of before the syntax discussion advances further. But my fear is that this can drag on for months and years.

GKFX commented 11 months ago

I am thinking that offset_of!(Type, self.a.b.0[n]) might be a good choice - it's a single expr which the compiler already knows how to parse, with no edge cases around 0.0 etc.

To summarise, the main options that I'm aware of and my views on them are:

Type.a.b - not a syntax currently used elsewhere so would need custom parsing, breaks the macro follow-set restrictions if matching the type with ty.
C-style - current implementation but has issues with consecutive tuple accesses and whitespace not matching the current metavariables. Needs some quite awkward parsing because of that.
Pattern-based - significantly more verbose, no obvious method for array indexing, but is a pre-existing syntax so no parser work
self.a.b - slightly more verbose version of C-style. Also should need no parser work.
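To illustrate the claim that a self-prefixed path is a single expression the compiler can already parse, here is a small sketch. `path_as_expr` and `demo` are hypothetical names; the macro only stringifies its argument, so nothing is resolved or type-checked.

```rust
/// Hypothetical macro: accepts the whole field path as one `expr` fragment.
macro_rules! path_as_expr {
    ($path:expr) => {
        stringify!($path)
    };
}

/// The path, including tuple and index accesses, matches a single `$path:expr`.
fn demo() -> &'static str {
    path_as_expr!(self.a.b.0[3])
}
```

This shows only that the form fits the existing `expr` matcher; how such an expression would be interpreted as a field path is a separate design question.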

the8472 commented 11 months ago

You're forgetting enums

est31 commented 11 months ago

@CryZe btw, patterns have come up in the syntax discussion on zulip.

Maybe it would be best to move this discussion there? Or to a dedicated thread elsewhere. This is a tracking issue after all :).

RalfJung commented 11 months ago

Regarding pattern syntax, I wonder how arrays would be represented, so what would be the pattern analog of C's offsetof(S, field.field[20].10)?

offset_of!(S.field.field[20].10). (C does not have .10 though, is this a Rust tuple field access?)

I don't think I understand what the pattern syntax would look like, could someone give some examples? offset_of!(Struct { field1, field2 }) makes no sense so surely it'd have to be a very restricted subset of patterns, and rather redundant if one has to always add the ..?

est31 commented 10 months ago

surely it'd have to be a very restricted subset of patterns, and rather redundant if one has to always add the ..?

Yeah the zulip thread proposes usage of @ for that.

C does not have .10 though, is this a Rust tuple field access?

Yes, sorry, I forgot. Replace that .10 with .ten :).

est31 commented 9 months ago

As #118799 has been merged, non-nested offset_of!() is now stable on the master branch and will likely be part of the 1.77.0 stable release on March 21, 2024. For details of what was stabilized, see the stabilization report.

I'm closing this tracking issue as the still unstable aspects of the offset_of macro now have different tracking issues:

Thanks everyone involved in getting offset_of specified, implemented, tested and stabilized.
