rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.96k stars 12.53k forks source link

Tracking Issue for `core::mem::variant_count` #73662

Open doctorn opened 4 years ago

doctorn commented 4 years ago

The feature gate for the issue is #![feature(variant_count)].

Steps

Unresolved Questions

None

Implementation history

avl commented 4 years ago

@RalfJung Responding to your response here instead, as per kennytm:s request.

The reason I asked about if it returned "largest discriminant + 1" is because such a behavior would be kind of reasonable with your motivation, about using the enum as an index into a Vec. There could be other situations where knowing the maximum discriminant value could actually be more important than knowing the number of elements. That said, maybe there should be two functions, one to get the maximum discriminant and one to get the number of elements. Or maybe the number of variants really is the only thing anyone ever cares about. Just wanted to bring up the question :-) .

The other point I was making was that perhaps an obligatory u64 is not necessary for the count. Other counts in rust use usize, and though I understand that you could in theory have 2^40 enum variants on a 32 bit machine, I don't think the compiler can really compile this even on a 64 bit machine right now. Also, I think having 2^40 enum variants on a 32-bit machine is sufficiently 'weird' that it seems reasonable for some friction to occur. I personally cannot imagine a scenario where an application would need more enum variants than the number of addressable bytes on the platform, though of course the fact I can't imagine it doesn't mean it doesn't exist :-) .

Really, there are some problems which are too big for some machines. As an example, a problem requiring a 100MB lookup-table is simply too big to process on a 16 bit machine. My argument is that a problem requiring a 2^40 variant enum is simply too big to process on a 32-bit machine. That said, a reasonable person could certainly disagree here!

Knowing the largest discriminant of an enum could be useful in these cases:

RalfJung commented 4 years ago

I see, thanks.

I personally have no skin in this game, I doubt I will ever use that API.^^ I was just involved in reviewing. @doctorn would be the one to ask about motivation. :)

Flying-Toast commented 3 years ago

We should probably add a deny lint to rustc that prevents calling variant_count::<T>() where T is not an enum.

Regarding the API itself though, IMO it feels a bit janky:

What would be the drawbacks to doing this with a trait?

trait Enum {
    type Repr;
    /// Returns the maximum possible index into an array containing every variant of this enum.
    /// Returns `None` if the enum has no variants.
    const fn max_variant() -> Option<Repr>;
}

The compiler could automatically implement this trait for all enums, and it would be used like:

enum Foo { /* ... */ }

<Foo as core::intrinsics::Enum>::max_variant()
Flying-Toast commented 3 years ago

RE: @doctorn's comment in 73418:

you can’t use <T as DiscriminantKind>::Discriminant because if I have an enum with 256 variants the discriminant type will be u8 which can’t represent 256.

We could also use a similar API like my previous comment, but without a trait:

/// Returns the maximum possible index into an array containing every variant of the enum `T`.
/// Returns `None` if the enum has no variants.
fn max_variant<T>() -> Option<<T as DiscriminantKind>::Discriminant>

Combined with a rustc lint for non-enum Ts, I think that would get rid of all unspecified behavior. Thoughts?

doctorn commented 3 years ago

Honestly the Enum trait sounded better to me. I much prefer having the number of variants returned rather than the maximum repr.

Flying-Toast commented 3 years ago

Honestly the Enum trait sounded better to me. I much prefer having the number of variants returned rather than the maximum repr.

Ok.

One more thing - the discriminant_value intrinsic seems to use u64s for discriminants; maybe variant_count should return a u64 to be consistent with that?

lcnr commented 3 years ago

the discriminant_value intrinsic seems to use u64s for discriminants;

https://doc.rust-lang.org/nightly/std/intrinsics/fn.discriminant_value.html

it does not, the docs are outdated on stable though '^^

BusyJay commented 2 years ago

Should this function be a const function?

johnkjellberg commented 2 years ago

Should this function be a const function?

The way I want to use this require it to be const. And I don't really see why it shouldn't?

doctorn commented 2 years ago

@BusyJay I believe this function is const, but the docs incorrectly omit this

RalfJung commented 2 years ago

The docs mention constness in the top-right corner: const: [unstable](https://github.com/rust-lang/rust/issues/73662).

const fn is only printed for functions whose constness is stable.

joshtriplett commented 2 years ago

We discussed this in today's @rust-lang/lang meeting.

This seems like it fits in with a broader story of enums, including things like AsRepr.

In the absence of those, we're not sure if we want to have this function in its current state.

EriKWDev commented 1 year ago

I use this all the time.

I love having enums represent all the variants I want and then storing data for them in arrays which requires me to know the number of them in compile time.

#[repr(usize)]
enum MyKind {
    Kind1,
    Kind2,
}

// This constant gets outdated if the number of variants change...
pub const NUM_MY_KIND: usize = 2;

I make game engines and games and this is a demonstration of a bit of data-oriented-ness.

struct KindData1 {/* ... */}
struct KindData2 {/* ... */}
struct KindData3 {/* ... */}

struct LargeData1 {/* ... */}
struct LargeData2 {/* ... */}
struct LargeData3 {/* ... */}

struct Data {
    //...
    base_1: [KindData1; NUM_MY_KIND],
    base_2: [KindData2; NUM_MY_KIND],
    base_3: [KindData3; NUM_MY_KIND],

    large_data_1: Vec<LargeData1>,
    large_data_2: Vec<LargeData2>,
    large_data_3: Vec<LargeData3>,

    // These are the same length
    kinds: Vec<MyKind>,
    flags_1: Vec<[Option<usize>; NUM_MY_KIND]>,
    flags_2: Vec<[Option<usize>; NUM_MY_KIND]>,
    flags_3: Vec<[Option<usize>; NUM_MY_KIND]>,
}

fn func(data: &Data) {
    // Operate on data common to all kinds
    for &kind in &data.kinds {
        let i = kind as usize;
        let (d1, d2) = (&mut data.base_1[i], &mut data.base_2[i]);
        // ..
    }
    for &kind in &data.kinds {
        let i = kind as usize;
        let (d2, d3) = (&mut data.base_2[i], &mut data.base_3[i]);
        // ..
    }

    assert!(data.kinds.len() == data.flags_1.len());
    assert!(data.flags_2.len() == data.flags_3.len());
    for (index, &kind) in data.kinds.iter().enumerate() {
        let i = kind as usize;

        // Operate on permutations of having and not having data
        if let (Some(id1), Some(id2)) = (data.flags_1[index][i], data.flags_2[index][i]) {
            let base1 = &data.base_1[i];
            let base2 = &data.base_2[i];
            let (d1, d2) = (&mut data.large_data_1[id1], &mut data.large_data_2[id2]);
            // ...
        }
    }

    // Operate on all data. I want this to be packed in memory
    for data_1 in &mut data.large_data_1 {
        // ..
    }
    for data_2 in &mut data.large_data_2 {
        // ..
    }

    // ..
}

I have data that is dependent on the kind but not unique per instance of the kind. This way I don't have to do a match at each location and getting the permutations of per-kind data is a simple lookup in arrays.

The other usage I have in some places is where I maybe have unique (small, usually just an id) data per instance of a kind, and I like representing that as a Vec<[Option<usize>; NUM_KIND]>. This allows me to store the data for all kinds packed in a Vec and then use the sparse Vec<[Option<usize>; NUM_KINDS]> to point back into the packed Vecs. Kind of like an ECS


The constant that keeps track of the number of kinds would be very nice to be able to get without setting manually :)

(I have tried to add a 'Length' member to the enum (const NUM: usize = MyEnum::Length as usize), but that gets very annoying at places where you do a match since it has to be ignored all the time..)

joshtriplett commented 7 months ago

Nominating this for lang, because empirically we've stalled out on a potentially useful library addition on the basis of proposed lang features, which themselves seem to have stalled out.

I'd love to see AsRepr if someone wants to pick that back up and get it over the finish line. But also, this might be a reasonable interim step, and it doesn't seem like it does any harm to have it even if we add better solutions in the future.

MarinPostma commented 6 months ago

@joshtriplett I'd like to pick this up, if someone can point me in the right direction

scottmcm commented 6 months ago

I volunteered to explore something here. Initial idea thread: https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/.60enum.23.60.20magic/near/425164499

traviscross commented 6 months ago

@rustbot labels -I-lang-nominated

We discussed this in the meeting today, and as above, scottmcm volunteered to make a proposal here and we decided to wait for that. So let's unnominate this for now, but please nominate the proposal, wherever it is made, for us.

Supreeeme commented 3 weeks ago

Any updates on this?

traviscross commented 3 weeks ago

The thread to track is: