Open WalterSmuts opened 2 years ago
Thanks for your offer and your input. I'd like to accept your offer.
You make a very good point about making the API more readable. Honestly, I never liked the current style with sub_array_ref::<LEN>(offset)
since it is kinda hard to read. I am also very interested in adding compile-time constraints to prevent as many panics as possible at compile time (I supposing that is your second suggestion).
Given your comment on slices, I'd like to clarify that (as I meant it) this crate is about obtaining a "sub-array" not working on "sub-array"s. And thus, it is quite sensible (in my opinion) to extract a "sub-array" from a slice, tho it is true that this operation can't be checked at compile time and thus generally may panic at run-time. Nevertheless, I would say that slices as a source should be at most of secondary concern, but I would rather keep them.
Anyway, as far as I understand, this is concisely what you are proposing:
Range
-like syntax (i.e. from..to
)Index
-operator (i.e. implementing that trait)At first glance, I supposed all of these enhancements would require the Nightly generic_const_exprs
feature (what you wrote also sounded a bit like it). However, you showed (via your playground code) that (1) can be entirely implemented in stable Rust, and only (2) requires the generic_const_exprs
feature.
Thus, I consider the following plan:
Separating those two points will allow users to keep using this crate with a stable compiler, tho without compile-time constraints, which at least is consistent with the current implementation, at least they get a nicer API. And for those who can afford to use Nightly, they get the compile-time constraints as well.
Some further notes:
Of course, I also looked into the playground example code that you kindly provided. You did there some nice ground work. Some points which I found can be improved/fixed:
Index
directly or Rust's arrays.marco_rules
to implement the range!(2..5)
syntax.Here a refined version of your code, that fixes the above.
In terms of a future API there is the aforementioned panicing variant, that should work with stable Rust and may work with slices, which I imaging to look something like this:
// The trait providing the API
use sub_array::CheckedSubArray;
use sub_array::range;
let slice: &[u8] = &[1,2,3,4,5];
// works
let sub_array: &[u8;2] = slice.checked_sub_array_ref(range!(3..5)).unwrap();
assert_eq!(sub_array, [4,5]);
// compiles, but panics
let sub_array: &[u8;2] = slice.checked_sub_array_ref(range!(4..6)).unwrap();
Basically the current API, (mostly) same semantics, same source types (arrays work as well), but with more verbose names to make "space" for the constraint version. As you can see, I began to like a "checked" version that returns an Option
better than directly panicing, it's more like a spontaneous idea, but it makes it easier to find a name for it.
And then the Nightly feature-gated compile-time enforce variant:
#![feature(generic_const_exprs)]
// The trait providing the API
use sub_array::SubArray;
use sub_array::range;
let array: &[u8; 5] = &[1,2,3,4,5];
// works
let sub_array: &[u8;2] = array.sub_array_ref(range!(3..5));
assert_eq!(sub_array, [4,5]);
// wouldn't even compile
//let sub_array: &[u8;2] = array.sub_array_ref(range!(4..6));
And of course, the Nightly variant would also allow indexing with the same semantics as above:
#![feature(generic_const_exprs)]
// (Const)Range is all we need
use sub_array::range;
let array: &[u8; 5] = &[1,2,3,4,5];
// works
let sub_array: &[u8;2] = &array[range!(3..5)];
assert_eq!(sub_array, [4,5]);
// wouldn't even compile
//let sub_array: &[u8;2] = &array[range!(4..6)];
What do you think about all that?
Wow! You took the idea and ran with it! Nice :)
Basically each of your comments I agree with. Nice work on making it all just BETTER.
We don't need a wrapper struct, we can implement Index directly or Rust's arrays.
This is probably the most glaringly obvious thing I overlooked. Well spotted! That makes it all much nicer.
Given your comment on slices, I'd like to clarify that (as I meant it) this crate is about obtaining a "sub-array" not working on "sub-array"s. And thus, it is quite sensible (in my opinion) to extract a "sub-array" from a slice, tho it is true that this operation can't be checked at compile time and thus generally may panic at run-time. Nevertheless, I would say that slices as a source should be at most of secondary concern, but I would rather keep them.
Yep, you're 100% right. For some reason I'm in a "const check everything" mindset, and it's easy to overlook other use-cases. There is no reason not to extend the API to slices too. Nice!
Some further notes:
- When reading the above, one might get the impression, that the crate feature will change the semantics of the API, which in my opinion should be prevented. Thus, I propose two different APIs (or traits more specifically), one with panicing semantics and another that has compile-time constraints and is practically panicing-free, the latter one will be only available with the said crate-feature and a Nightly compiler.
- I'd like to push users to use the compile-time constraint version (if available). Thus, the compile-time constraint version should get the simpler and more obvious name. Also, I suppose using the indexing operator is the most obvious usage pattern, and thus should have the compile-time constraints (tho that will make it only available on Nightly).
Mmm... I think I understand your concern. If we had exactly the same API and all that adding the const_checked
feature would change is to add some compile time constraint checks, then we could turn a compile-able program into a non-compile-able program. So turning the feature on would essentially be a major version bump. This would be especially concerning when the generic_const_exprs
feature becomes stable and we'd want to remove the feature or perhaps set it as default
. Then we'd be breaking folks's programs if we don't do a major version bump.
The strange situation is that we'd only be breaking programs that would DEFINITELY panic. Although, only if it hits that case at runtime, which it may not. Not sure how strict semver is in this situation, but IMO this may be a grey area. The alternate is just to do a major version bump when that upgrade happens. Then we don't break programs but allow for the more ergonomic API in the current panic-able version. So I guess the question is: "Which is more important, not having a major version bump, or having an ergonomic API?"
Of course, I also looked into the playground example code that you kindly provided. You did there some nice ground work. Some points which I found can be improved/fixed
All of them are better! NICE!
In terms of a future API there is the aforementioned panicing variant, that should work with stable Rust and may work with slices, which I imaging to look something like this:
I like the idea of a checked variant :+1: But I do think that the unchecked variant would be useful too (You've completely convinced me of the slice use-case :P). So the unchecked variant could use the slice[start...end]
notation and internally call the unchecked variant and just unwrap()
or expect("Wrong size")
the result.
A note on naming: I'm not sure the ref
suffix is standard. I understand you're obtaining a reference to the sub-array, but looking at the rest of the naming in the standard library it's not explicit. For example index and index_mut has no ref
suffix for the immutable reference variant, even though index
returns a reference. There are many other examples in the naming used in this way just in the slice documentation. I do see some examples of the ref
pattern though. Since we're working with the Index
trait I'd suggest we follow that convention, although suddenly I'm not so sure of myself anymore.
Anyway, this is exciting! Thanks for being open to collaboration! I think this would turn into an extremely useful crate!
I can code up a PR to get the ball rolling. Just want to co-ordinate that we don't both start implementing and have wasted effort. Let me know if you'd rather do it yourself or if you have thoughts on the above comment. Will likely start implementing in a day or so.
Sorry, for the late answer. I'm a little bit busy currently, so it sounds quite good to me, if you like to start an PR.
About your comment regarding semver: an "in-place" upgrade from panicking to compile-time errors would be definitely a breaking change, and there are multiple reason why this would be very bad here, for once, a program that panics is strictly speaking still a valid program (tho probably not very useful), so preventing them from compiling breaks them. More practically examples are:
sub-array
(you can't compile-time check that, but it wouldn't ever panic)However, there are actually wrong programs those exhibiting undefined behavior (very different from panics), and fixing them is typically not consider a breaking change, because they were totally broken before, but that does not apply here.
Actually, in my mind, I always assumed that the new API will be a breaking/major change. So I'm cool with making a breaking change, much rather than accidentally breaking someone else's code.
About the checked-ness: I totally agree that the checked variant is nice and that it would be useful to also have an "unchecked" panicking variant. But I'm not sure what a nice API would be: literal writing foo[start..end]
would just yield a slice, and I'm somewhat strongly opposed to the idea of using foo[range!(start..end)]
(foo: &[T]
) as panicking variant, because when reading code, it becomes indistinguishable from foo[range!(start..end)]
(foo: &[T;N]
) which instead of panicking would have the compile-time guarantee that if it compiles, it never panics. And this get's even worse when we consider that arrays may (in certain places) silently deref to slices.
However on a seemingly unrelated note, I also noticed that when we rewrite the API to only use the rang!
macro particularly for slices, it would be a regression in the functionality of this crate, because the with the current crate, one can actually extract an sub-array at an arbitrary run-time given offset (maybe most useful with slices, which can be arbitrarily long and one might e.g. extract a sub-array from the end).
So I though maybe we could jointly solve these two issues by introducing a second "ConstRange" type and macro, one that isn't "fully const" but rather "half const" i.e. having a const size but a variable offset (sort of how the current API works). I'm kinda struggling with a good name for it, maybe something like ConstSizeRange
(emphasizing that only the size is const) or VariableOffsetRange
(emphasizing that only the offset is variable). The macro would be named analogously, it should accept the same patterns as the ConstRange
but needs additional rules to be able to specify the variable offset (I really think the old foo<SIZE>(offset)
style is kinda bad), maybe some special-case rule that parses var_off_range!(n..n+3)
(where n
can be a variable) can do it.
Then we can still support the arbitrary offset use-case while also having a distinguishing feature to implement directly panicking indexing:
BoundsCheck at: | Compile Time | Run Time |
---|---|---|
Slice | n/a | foo[var_off_range!(3..5)] |
Array | foo[range!(3..5)] |
foo[var_off_range!(3..5)] |
Lastly, about naming: I actually wasn't that much aware about the "missing" ref
s in the std-lib, I think I was looking at the Any
-trait, which adds the ref
suffix to disambiguate between by-value and by-reference use-cases. However, I don't really see a strong use-case for by-value extractions, which couldn't be achieved by just copying or cloning the result of the by-reference functions. So I think we can get rid of the ref
suffix.
Hi, just touching base again. Haven't forgotten about this but things got a bit busy my side. I still intend to do this PR but after reading your [great] suggestions I've realised it's a bit larger than I originally thought. Just letting you know I haven't disappeared.
I like your idea! And the name! (hence why I'm asking to collaborate rather than publishing my own crate).
How do you feel about trying to make this api a bit closer to the normal slicing API?
I've played around a bit and the closest I can get is this:
I think with a little extra
proc_macro
trickery we may be able to get to the following:The main value I believe this crate would bring is to provide a safe API for doing sub-array slicing on arrays. You can enforce the safety via compile time constraints using trait bounds. This is unfortunately not possible using slices, so I would also propose the API on slices get removed. It feels a bit strange to have it in a crate with the word
array
in anyway.There definitely are some disadvantages of doing what I'm suggesting. The main one is that it requires
generic_const_exprs
feature to do the trait bound constraints, which is still unstable. This means the consumers of this library need to use thenightly
compiler until it get stabilised.Have a look at this playground to get an idea of what I'm suggesting.
The main thing you may notice where I've gone in a separate direction from you is the use of the
SubArray
wrapper struct. This is to allow overloading of theIndex
trait. I'm not married to the idea, but I do think it has some advantages.Let me know what you think! :)