Open aarkegz opened 1 month ago
@stepancheg I'll discuss these methods here.
Default implementation are costly. Generating them automatically for structs and enums in proc_macro is possible and not likely to be a problem
Right.
But on top of that, we can changeEnumerator
to be simple type, universal for all enumerable types, which just calls enumerable_from_index
struct Enumerator<E: Enumerable> { index: usize }
The advantages:
Enumerator
no longer needs to generated, simpler API, simpler implementationDoubleEndedIterator
(also operations like skip(n)
of size_hint()
will be fast)
DoubleEndedIterator
on top of current implementation, but this is tricky)The disadvantages:
usize
Not clear:
Enumerable
type, iteration can be faster or slower. For iteration over tuples, it division by constant, very fast. For deeply nested enums it may be slowerBut on top of that, we can change
Enumerator
to be simple type, universal for all enumerable types, which just callsenumerable_from_index
Yes it's a quite good idea.
- won't work for types larger than
usize
Maybe we should introduce a new trait named something like RandomlyAccessibleEnumerable
(it's a bad name though) for types with less than usize::MAX
possible values? Then use struct Enumerator
to implement Enumerable
arbitrarily. These giant types can stay on current implementations.
- dependending on
Enumerable
type, iteration can be faster or slower. For iteration over tuples, it division by constant, very fast. For deeply nested enums it may be slower
Yes. But intuitively speaking, it should be really deep to be noticeably slower.
Maybe we should introduce a new trait named something like
RandomlyAccessibleEnumerable
(it's a bad name though) for types with less thanusize::MAX
possible values?
I'd like put it to v2.0. And I think #26 could be included in v1.0 and released earlier.
I'd like put it to v2.0. And I think #26 could be included in v1.0 and released earlier.
It strikes me that ENUMERABLE_SIZE
should be an associate constant of RandomlyAccessibleEnumerable
instead of Enumerable
immediately after I merged #17 😂. Maybe it should be reverted later.
RandomlyAccessibleEnumerable
Isn't it overkill?
Also, is there a case where we'd want an enumerable of size larger than usize
? I think we can say that Enumerable
trait only works for types that have not greater than usize::MAX
elements.
I did actually only support smaller types in my implementation.
Isn't it overkill?
Maybe.
I did actually only support smaller types in my implementation.
Same for me. I was using it on small types and haven't considered these problems before.
I think we can say that
Enumerable
trait only works for types that have not greater thanusize::MAX
elements.
However, it's not always possible to say that a type have more or less than usize::MAX
possible values, especially the generic ones like Result or tuples. So I think it's best to make Enumerable
work best with types that have no more than usize::MAX
possible values, while still compatible with types that have more.
My plan now:
const Enumerable::ENUMERABLE_SIZE : Option<usize>
: Some(size)
for types with usize::MAX
or less possible values and None
for others. I think it's better than just panic.Enumerable::enumerator_since(index: usize) -> Self::Enumerator
, Enumerable::enumerable_from_index(index: usize) -> Self
: efficient built-in implementation and #[derive]
implementation based on Enumerable::ENUMERABLE_SIZE. Simple and stupid default implementation provided.Enumerable::enumerable_index(value: Self) -> usize/Option<usize>
: same as above.const Enumerable::ENUMERABLE_SIZE : Option
: Some(size) for types with usize::MAX or less possible values and None for others. I think it's better than just panic
It maybe actually not be better. Because now if the size is larger then usize::MAX
you get clean compilation error. And with proposed change, you get the same error, but later, in runtime (because what else you can possibly to with Option
but unwrap()
?)
Another argument is this: there's no practical reason as far as I see to have enumerable size larger than usize::MAX
.
a) Enumerable
is useful because it can be mapped to usize
to be useful as index (e.g. to implement constant time lookup by Enumerable
instance; that was my plan to submit next to this crate). If a type is larger than usize::MAX
, it won't be useful in this context.
b) you won't be able to enumerate it anyway in reasonable time, even if you for some reason just want to enumerate it, not use as lookup key. Just enumerating usize on 64-bit machine would take hundreds of years.
Enumerable::enumerator_since(index: usize)
enumerator
without argument is what users want in 99.99% cases, and if a user needs since
, a user can just call skip
on the enumerator iterator.
Simple and stupid default implementation provided
I think Enumerable trait should have only three members: from index, to index, and size. All three must be efficient (thus properly generated), otherwise they would be dangerously unhelpful.
And if all three provided, there's no need to have complex enumerator implementation: the only implementation can be based on "from index".
I think Enumerable trait should have only three members: from index, to index, and size. All three must be efficient (thus properly generated)
I agree. What I disagree is giving up compatibility for seemingly larger types. It makes perfect sense to me that a user want to enumerate not all but the first thousands or millions elements of (usize, bool)
or something like that. And a compilation error is completely unsolvable and unavoidable here if we just keep from_index, to_index, and size. Also, randomly sampling on these larger types are useful and possible and I'm planning on this.
What I want and proposed is efficiency with compatibility. Fast version of Enumerable::enumerable_from_index
and Enumerable::enumerable_index
for built-in implementations and derived implementations(most of possible types actually) as you want, and slower but backward-compatible version for others.
And if all three provided, there's no need to have complex enumerator implementation: the only implementation can be based on "from index".
That's true, and that's fast for primitives and plain enums. But the question os, such enumerators may be not as efficient as the complex ones when enumerating structs and enums with fields.
enumerator
without argument is what users want in 99.99% cases, and if a user needssince
, a user can just callskip
on the enumerator iterator.
It's true for index-based enumerator. For classical ones, skip
need since
.
There isn't any practical reasons to enumerate all possible values of a large type, but there are reasons to enumerate or sample a reasonable number of it. That's why I wouldn't just give them up.
you get the same error, but later, in runtime (because what else you can possibly to with
Option
butunwrap()
?)
I think it's acceptable because it will crash very early if a user unwrap
it. Careful users (and authors) can match/is_some()
/is_none()
in a const context.
What I want and proposed is efficiency
It is not clear that iterating with the way iteration is implemented now is faster than iteration by manipulating indices. May be quite the opposite. Because, for example, division and subtraction is fast, and branches are not.
with compatibility
There's not many users of enumerable
crates yet, and it is not 1.0 version to worry about backwards compatibility.
It makes perfect sense to me that a user want to enumerate not all but the first thousands or millions elements of
(usize, bool)
or something like that
I cannot see any practical reasons to do that. Can you give an example, why someone would want that?
I think it's acceptable because it will crash very early if a user unwrap it. Careful users (and authors) can match/is_some()/is_none() in a const context.
As I see it, extra code verbosity for no practical gain.
I think I misunderstood your meaning of enumerable
crate, which is quite different from what I expected. I should probably publish my version separately then, because this one crate may not fit my needs.
I think I misunderstood your meaning of
enumerable
crate, which is quite different from what I expected. I should probably publish my version separately then, because this one crate may not fit my needs.
Maybe what we expect are very different. I'll continue developing this crate with my plan and I hope you can kindly allow me to use the interface proposed by you. Feel free to use my code in any way you like, if it's helpful for you to write your version, as the license has already confirmed. Thanks for your help. Hope we can collaborate again in the future.
Two methods (maybe with default implementations provided) to map index from/to enumerated value.