Open cramertj opened 5 years ago
Since the linked implementation PR is closed (just due to inactivity) we don't really have an issue to track yet. I'll go ahead and close this one, but if #62282 resurfaces in the future we can re-open!
Re-opening because of https://github.com/rust-lang/rust/pull/77065
https://doc.rust-lang.org/nightly/std/ops/trait.OneSidedRange.html links to https://github.com/rust-lang/rust/issues/69780 which is closed. Either it should be reopened (and some details filled in) or #[unstable]
attributes should be changed to point here instead, depending on whether it’s useful to track OneSidedRange
separately.
Can we implement OneSidedRange
for RangeFull
as well? I'd like to be able to say self.slice.take(..)
as a shorthand for std::mem::swap(&mut self.slice, &[])
(or I suppose for self.slice.take(0..)
). This doesn't fit the naming OneSidedRange
or the description that says the range is unbounded on one side[^1], but I don't see any reason why RangeFull
should not be usable here.
[^1]: Technically it doesn't say it has to be bounded on the other side, but it does list ..
as a non-compliant type.
Can we implement
OneSidedRange
forRangeFull
as well?
Perhaps also rename OneSidedRange
to UnboundedRange
, in that case?
Please, rename. Method take
is unusable with its current name, because it conflicts with std::io::Read::take
.
Let me describe my actual production use case.
First I read some data from file using std::io::Read::read_to_end
. To be able to read data I have to write use std::io::Read
. Then I started to parse that data using various functions, including <[T]>::take
. But I was unable to do this because of use std::io::Read
, see playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=e467873898a5f160420a5d6f13274ffa
FYI: #69780 is still being linked for OneSideRange
in the docs when that issue is closed; should the tracking issue be moved to here?
I agree with @safinaskar that a different name for take
(and take_mut
) should be considered. I like take_slice
and take_slice_mut
for the current API.
But changing the API to replace take(range)
with take_before(usize)
and take_after(usize)
(and possibly take_all()
) would be clearer and removes the need to resolve issues with OneSidedRange
.
I'd love to see taking from a slice stabilized. But the introduction of OneSidedRange
seems to be the wrong path to take. I looked at the previous discussion for the benefits of adding OneSidedRange
. They are:
OneSidedRange
did not solve the method naming problem, the original design without OneSidedRange
can also use regular indices rather than length from the end, and I don't see any other code that uses OneSidedRange
. The only objective benefit realized from the above is that the number of methods reduced from 8 to 6.
Meanwhile the OneSidedRange
code requires two match statements to turn the range parameter back into a single index and then call separate paths for the two possible directions. It also adds panic code for the unreachable variants when changing the range to an index and a direction.
The benefits of OneSidedRange
just do not justify the additional complexity, especially as part of the standard library.
I propose the creation of a new implementation PR to remove OneSidedRange
and replace take
with take_before
and take_after
methods + _mut
variants. I'm happy to make the PR (but also happy for someone else to do it). But I'd like to check if anyone is strongly opposed first.
cc @nox @cramertj @timvermeulen @ibraheemdev who worked on the previous implementation PR's.
@safinaskar this conflict is indeed very unfortunate, however do note that there are workarounds. Notably you can:
Read
trait in a different module or function (you can write use
inside a function) than the usage of the take method(&mut s).take(..1)
)<[_]>::take(&mut s, ..1)
)s.take(..1)
where s
has type &mut &[_]
)See playground.
@jongiddy I do find the current version with OneSidedRange
a lot more readable than potential take_before
/take_after
(and a third variant?).
Meanwhile the
OneSidedRange
code requires two match statements to turn the range parameter back into a single index and then call separate paths for the two possible directions. It also adds panic code for the unreachable variants when changing the range to an index and a direction.
I do not see the problem here. We have a lot of match
es in std
and this was never a problem. If you are concerned about speed, then you don't need to be, compiler can fully optimize those matches out, see this godbolt for example (note how function implemented as-is in std
with OneSidedRange
has exactly the same asm as the function which is implemented using an approach you are proposing). By the panic I assume you mean unreachable!()
in split_point_of
which is unreachable and is also fully optimized-out by the compiler.
Perhaps split_off
could solve the naming problem? It's also reserved for collections, although in this case an argument would be R: UnboundedRange
rather than usize
. You could also implement both immutable and mutable variants under a single name, if orphan rules allow inherent impls on references (and change where the documentation would reside). It's not as if take
would be useful on &mut [T]
.
Edit: Regarding the usefulness of UnboundedRange
, I think it's necessary because the index approach is unclear. Which part of the slice is returned and which part stays in source? The naming alone wouldn't necessarily prevent accidental misuse, especially given code refactoring (replacing one with the other). Note that normal splits do not have this problem, because they return tuples, and tuples are explicitly ordered.
I just discovered this API, after I once again reimplemented my own helper function for splitting off a prefix or suffix of a mutable slice :D
Looking at the comments, my feedback:
take
, for the stated reasonssplit_off
suggestion, its just unfortunate that it overlaps with methods on owning datastructures like Vec or LinkedList. However, the BTree* types use the same name for a slightly different API already, so...split_affix
, because it removes either a suffix
or prefix
of the slice 😅 split_in_place
?split_range?
Example:
let x = slice.split_range(..a);
let x = slice.split_range(b..);
let x = slice.split_range(..);
let x = slice.split_range_mut(..a);
let x = slice.split_range_mut(b..);
let x = slice.split_range_mut(..);
split_«anything»
because all existing <[T]>::split*
methods use &[T]
as the receiver rather than the double-reference &mut &[T]
.pop_*()
or .drain_*()
..verb(..4)
rather than .verb_before(4)
, but -1 on calling the trait std::ops::OneSidedRange<usize>
or std::ops::UnboundedRange<usize>
. I don't think anybody needed a generic one/zero-sided range trait, in particular this trait cannot support {start: Excluded, end: Unbounded}
. I think the trait could just live as std::slice::VerbRange
similar to std::slice::SliceIndex
.these methods should can be implemented for str
. what happened when the range does not hit the char boundary is an unresolved question.
impl str {
fn verb<'a>(self: &mut &'a Self, range: impl VerbRange) -> Option<&'a Self>;
fn verb_mut<'a>(self: &mut &'a mut Self, range: impl VerbRange) -> Option<&'a mut Self>;
// fn verb_first_char<'a>(self: &mut &'a Self) -> Option<char>;
// fn verb_last_char<'a>(self: &mut &'a Self) -> Option<char>;
fn verb_prefix<'a>(self: &mut &'a Self, pattern: impl Pattern<'a>) -> Option<&'a Self>;
fn verb_suffix<'a>(self: &mut &'a Self, pattern: impl Pattern<'a>) -> Option<&'a Self>;
}
what happened when the range does not hit the char boundary is an unresolved question.
There is precedent for split_at
to panic when given an index not at a char boundary. (Although split_at_checked
is now being added)
Side note about full ranges being included in this API, but I think calling these "unbounded ranges" instead of "one-sided ranges" would make the APIs a bit clear. Even though the one-sided ranges have bounds, they still contain an unbounded side.
I just stumbled over this feature after writing take_first
(I called it pop_front
) myself, and wondering if the standard library has something similar. I regularly use &mut &[T]
and &mut &str
, which I think is a super cool pattern I'd like to see more support for.
Has pop_front
and pop_back
been considered instead of take_first
and take_last
? That would mirror the API of VecDeque
, which supports cheap "removal" of elements from the front and back as well.
Should this also have an array version? e.g.
fn take_first_chunk<'a, const N: usize>(self: &mut &'a Self) -> Option<&'a [T; N]>;
fn take_first_chunk_mut<'a, const N: usize>(self: &mut &'a mut Self) -> Option<&'a mut [T; N]>;
fn take_last_chunk<'a, const N: usize>(self: &mut &'a Self) -> Option<&'a [T; N]>;
fn take_last_chunk_mut<'a, const N: usize>(self: &mut &'a mut Self) -> Option<&'a mut [T; N]>;
Would be very useful for reading and writing chunks of bytes from a slice. u32::from_le_bytes(*(slice.take_first_chunk()))
Feature gate:
#![feature(slice_take)]
Public API
Steps / History
Unresolved Questions
-