Map index from/to enumerated value.

aarkegz commented 1 month ago

Two methods (maybe with default implementations provided) to map index from/to enumerated value.

println!("{}", <i8 as Enumerable>::enumerable_index(-127).unwrap()); // "1", as -127 is the 1st (0-based) possible value of i8
println!("{}", <i8 as Enumerable>::enumerable_from_index(4).unwrap()); // "-124"

aarkegz commented 1 month ago

@stepancheg I'll discuss these methods here.

[ ] Their names are OK, but are there better ones?
[ ] Default implementation are costly.
[x] Generating them automatically for structs and enums in proc_macro is possible and not likely to be a problem.

stepancheg commented 1 month ago

Default implementation are costly. Generating them automatically for structs and enums in proc_macro is possible and not likely to be a problem

Right.

But on top of that, we can changeEnumerator to be simple type, universal for all enumerable types, which just calls enumerable_from_index

struct Enumerator<E: Enumerable> { index: usize }

The advantages:

Enumerator no longer needs to generated, simpler API, simpler implementation
will work as DoubleEndedIterator (also operations like skip(n) of size_hint() will be fast)
- (although technically we can implement DoubleEndedIterator on top of current implementation, but this is tricky)

The disadvantages:

won't work for types larger than usize

Not clear:

dependending on Enumerable type, iteration can be faster or slower. For iteration over tuples, it division by constant, very fast. For deeply nested enums it may be slower

aarkegz commented 1 month ago

But on top of that, we can changeEnumerator to be simple type, universal for all enumerable types, which just calls enumerable_from_index

Yes it's a quite good idea.

won't work for types larger than usize

Maybe we should introduce a new trait named something like RandomlyAccessibleEnumerable(it's a bad name though) for types with less than usize::MAX possible values? Then use struct Enumerator to implement Enumerable arbitrarily. These giant types can stay on current implementations.

dependending on Enumerable type, iteration can be faster or slower. For iteration over tuples, it division by constant, very fast. For deeply nested enums it may be slower

Yes. But intuitively speaking, it should be really deep to be noticeably slower.

aarkegz commented 1 month ago

Maybe we should introduce a new trait named something like RandomlyAccessibleEnumerable(it's a bad name though) for types with less than usize::MAX possible values?

I'd like put it to v2.0. And I think #26 could be included in v1.0 and released earlier.

aarkegz commented 1 month ago

I'd like put it to v2.0. And I think #26 could be included in v1.0 and released earlier.

It strikes me that ENUMERABLE_SIZE should be an associate constant of RandomlyAccessibleEnumerable instead of Enumerable immediately after I merged #17 😂. Maybe it should be reverted later.

stepancheg commented 1 month ago

RandomlyAccessibleEnumerable

Isn't it overkill?

Also, is there a case where we'd want an enumerable of size larger than usize? I think we can say that Enumerable trait only works for types that have not greater than usize::MAX elements.

I did actually only support smaller types in my implementation.

aarkegz commented 1 month ago

Isn't it overkill?

Maybe.

I did actually only support smaller types in my implementation.

Same for me. I was using it on small types and haven't considered these problems before.

I think we can say that Enumerable trait only works for types that have not greater than usize::MAX elements.

However, it's not always possible to say that a type have more or less than usize::MAX possible values, especially the generic ones like Result or tuples. So I think it's best to make Enumerable work best with types that have no more than usize::MAX possible values, while still compatible with types that have more.

My plan now:

const Enumerable::ENUMERABLE_SIZE : Option<usize>: Some(size) for types with usize::MAX or less possible values and None for others. I think it's better than just panic.
Enumerable::enumerator_since(index: usize) -> Self::Enumerator, Enumerable::enumerable_from_index(index: usize) -> Self: efficient built-in implementation and #[derive] implementation based on Enumerable::ENUMERABLE_SIZE. Simple and stupid default implementation provided.
Enumerable::enumerable_index(value: Self) -> usize/Option<usize>: same as above.

stepancheg commented 1 month ago

const Enumerable::ENUMERABLE_SIZE : Option: Some(size) for types with usize::MAX or less possible values and None for others. I think it's better than just panic

It maybe actually not be better. Because now if the size is larger then usize::MAX you get clean compilation error. And with proposed change, you get the same error, but later, in runtime (because what else you can possibly to with Option but unwrap()?)

Another argument is this: there's no practical reason as far as I see to have enumerable size larger than usize::MAX.

a) Enumerable is useful because it can be mapped to usize to be useful as index (e.g. to implement constant time lookup by Enumerable instance; that was my plan to submit next to this crate). If a type is larger than usize::MAX, it won't be useful in this context.

b) you won't be able to enumerate it anyway in reasonable time, even if you for some reason just want to enumerate it, not use as lookup key. Just enumerating usize on 64-bit machine would take hundreds of years.

Enumerable::enumerator_since(index: usize)

enumerator without argument is what users want in 99.99% cases, and if a user needs since, a user can just call skip on the enumerator iterator.

Simple and stupid default implementation provided

I think Enumerable trait should have only three members: from index, to index, and size. All three must be efficient (thus properly generated), otherwise they would be dangerously unhelpful.

And if all three provided, there's no need to have complex enumerator implementation: the only implementation can be based on "from index".

aarkegz commented 1 month ago

I think Enumerable trait should have only three members: from index, to index, and size. All three must be efficient (thus properly generated)

I agree. What I disagree is giving up compatibility for seemingly larger types. It makes perfect sense to me that a user want to enumerate not all but the first thousands or millions elements of (usize, bool) or something like that. And a compilation error is completely unsolvable and unavoidable here if we just keep from_index, to_index, and size. Also, randomly sampling on these larger types are useful and possible and I'm planning on this.

What I want and proposed is efficiency with compatibility. Fast version of Enumerable::enumerable_from_index and Enumerable::enumerable_index for built-in implementations and derived implementations(most of possible types actually) as you want, and slower but backward-compatible version for others.

And if all three provided, there's no need to have complex enumerator implementation: the only implementation can be based on "from index".

That's true, and that's fast for primitives and plain enums. But the question os, such enumerators may be not as efficient as the complex ones when enumerating structs and enums with fields.

enumerator without argument is what users want in 99.99% cases, and if a user needs since, a user can just call skip on the enumerator iterator.

It's true for index-based enumerator. For classical ones, skip need since.

There isn't any practical reasons to enumerate all possible values of a large type, but there are reasons to enumerate or sample a reasonable number of it. That's why I wouldn't just give them up.

aarkegz commented 1 month ago

you get the same error, but later, in runtime (because what else you can possibly to with Option but unwrap()?)

I think it's acceptable because it will crash very early if a user unwrap it. Careful users (and authors) can match/is_some()/is_none() in a const context.

stepancheg commented 1 month ago

What I want and proposed is efficiency

It is not clear that iterating with the way iteration is implemented now is faster than iteration by manipulating indices. May be quite the opposite. Because, for example, division and subtraction is fast, and branches are not.

with compatibility

There's not many users of enumerable crates yet, and it is not 1.0 version to worry about backwards compatibility.

It makes perfect sense to me that a user want to enumerate not all but the first thousands or millions elements of (usize, bool) or something like that

I cannot see any practical reasons to do that. Can you give an example, why someone would want that?

I think it's acceptable because it will crash very early if a user unwrap it. Careful users (and authors) can match/is_some()/is_none() in a const context.

As I see it, extra code verbosity for no practical gain.

I think I misunderstood your meaning of enumerable crate, which is quite different from what I expected. I should probably publish my version separately then, because this one crate may not fit my needs.

aarkegz commented 1 month ago

I think I misunderstood your meaning of enumerable crate, which is quite different from what I expected. I should probably publish my version separately then, because this one crate may not fit my needs.

Maybe what we expect are very different. I'll continue developing this crate with my plan and I hope you can kindly allow me to use the interface proposed by you. Feel free to use my code in any way you like, if it's helpful for you to write your version, as the license has already confirmed. Thanks for your help. Hope we can collaborate again in the future.

GeminiLab / enumerable

Map index from/to enumerated value. #27