rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
98.9k stars 12.78k forks source link

Tracking issue for `slice_take` #62280

Open cramertj opened 5 years ago

cramertj commented 5 years ago

Feature gate: #![feature(slice_take)]

Public API

impl<T> [T] {
    fn take<'a, R: OneSidedRange<usize>>(self: &mut &'a Self, range: R) -> Option<&'a Self>;
    fn take_mut<'a, R: OneSidedRange<usize>>(self: &mut &'a mut Self, range: R) -> Option<&'a mut Self>;
    fn take_first<'a>(self: &mut &'a Self) -> Option<&'a T>;
    fn take_first_mut<'a>(self: &mut &'a mut Self) -> Option<&'a mut T>;
    fn take_last<'a>(self: &mut &'a Self) -> Option<&'a T>;
    fn take_last_mut<'a>(self: &mut &'a mut Self) -> Option<&'a mut T>;
}

// core::ops

trait OneSidedRange<T: ?Sized>: RangeBounds<T> {}
impl<T> OneSidedRange<T> for RangeTo<T> where Self: RangeBounds<T>;
impl<T> OneSidedRange<T> for RangeFrom<T> where Self: RangeBounds<T>;
impl<T> OneSidedRange<T> for RangeToInclusive<T> where Self: RangeBounds<T>;

Steps / History

Unresolved Questions

-

KodrAus commented 4 years ago

Since the linked implementation PR is closed (just due to inactivity) we don't really have an issue to track yet. I'll go ahead and close this one, but if #62282 resurfaces in the future we can re-open!

m-ou-se commented 3 years ago

Re-opening because of https://github.com/rust-lang/rust/pull/77065

SimonSapin commented 2 years ago

https://doc.rust-lang.org/nightly/std/ops/trait.OneSidedRange.html links to https://github.com/rust-lang/rust/issues/69780 which is closed. Either it should be reopened (and some details filled in) or #[unstable] attributes should be changed to point here instead, depending on whether it’s useful to track OneSidedRange separately.

lilyball commented 2 years ago

Can we implement OneSidedRange for RangeFull as well? I'd like to be able to say self.slice.take(..) as a shorthand for std::mem::swap(&mut self.slice, &[]) (or I suppose for self.slice.take(0..)). This doesn't fit the naming OneSidedRange or the description that says the range is unbounded on one side[^1], but I don't see any reason why RangeFull should not be usable here.

[^1]: Technically it doesn't say it has to be bounded on the other side, but it does list .. as a non-compliant type.

andersk commented 2 years ago

Can we implement OneSidedRange for RangeFull as well?

Perhaps also rename OneSidedRange to UnboundedRange, in that case?

safinaskar commented 1 year ago

Please, rename. Method take is unusable with its current name, because it conflicts with std::io::Read::take.

Let me describe my actual production use case.

First I read some data from file using std::io::Read::read_to_end. To be able to read data I have to write use std::io::Read. Then I started to parse that data using various functions, including <[T]>::take. But I was unable to do this because of use std::io::Read, see playground: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=e467873898a5f160420a5d6f13274ffa

clarfonthey commented 1 year ago

FYI: #69780 is still being linked for OneSideRange in the docs when that issue is closed; should the tracking issue be moved to here?

jongiddy commented 1 year ago

I agree with @safinaskar that a different name for take (and take_mut) should be considered. I like take_slice and take_slice_mut for the current API.

But changing the API to replace take(range) with take_before(usize) and take_after(usize) (and possibly take_all()) would be clearer and removes the need to resolve issues with OneSidedRange.

jongiddy commented 1 year ago

I'd love to see taking from a slice stabilized. But the introduction of OneSidedRange seems to be the wrong path to take. I looked at the previous discussion for the benefits of adding OneSidedRange. They are:

OneSidedRange did not solve the method naming problem, the original design without OneSidedRange can also use regular indices rather than length from the end, and I don't see any other code that uses OneSidedRange. The only objective benefit realized from the above is that the number of methods reduced from 8 to 6.

Meanwhile the OneSidedRange code requires two match statements to turn the range parameter back into a single index and then call separate paths for the two possible directions. It also adds panic code for the unreachable variants when changing the range to an index and a direction.

The benefits of OneSidedRange just do not justify the additional complexity, especially as part of the standard library.

I propose the creation of a new implementation PR to remove OneSidedRange and replace take with take_before and take_after methods + _mut variants. I'm happy to make the PR (but also happy for someone else to do it). But I'd like to check if anyone is strongly opposed first.

cc @nox @cramertj @timvermeulen @ibraheemdev who worked on the previous implementation PR's.

WaffleLapkin commented 1 year ago

@safinaskar this conflict is indeed very unfortunate, however do note that there are workarounds. Notably you can:

See playground.

WaffleLapkin commented 1 year ago

@jongiddy I do find the current version with OneSidedRange a lot more readable than potential take_before/take_after (and a third variant?).

Meanwhile the OneSidedRange code requires two match statements to turn the range parameter back into a single index and then call separate paths for the two possible directions. It also adds panic code for the unreachable variants when changing the range to an index and a direction.

I do not see the problem here. We have a lot of matches in std and this was never a problem. If you are concerned about speed, then you don't need to be, compiler can fully optimize those matches out, see this godbolt for example (note how function implemented as-is in std with OneSidedRange has exactly the same asm as the function which is implemented using an approach you are proposing). By the panic I assume you mean unreachable!() in split_point_of which is unreachable and is also fully optimized-out by the compiler.

Kolsky commented 11 months ago

Perhaps split_off could solve the naming problem? It's also reserved for collections, although in this case an argument would be R: UnboundedRange rather than usize. You could also implement both immutable and mutable variants under a single name, if orphan rules allow inherent impls on references (and change where the documentation would reside). It's not as if take would be useful on &mut [T].

Edit: Regarding the usefulness of UnboundedRange, I think it's necessary because the index approach is unclear. Which part of the slice is returned and which part stays in source? The naming alone wouldn't necessarily prevent accidental misuse, especially given code refactoring (replacing one with the other). Note that normal splits do not have this problem, because they return tuples, and tuples are explicitly ordered.

Kimundi commented 9 months ago

I just discovered this API, after I once again reimplemented my own helper function for splitting off a prefix or suffix of a mutable slice :D

Looking at the comments, my feedback:

Example:

let x = slice.split_range(..a);
let x = slice.split_range(b..);
let x = slice.split_range(..);

let x = slice.split_range_mut(..a);
let x = slice.split_range_mut(b..);
let x = slice.split_range_mut(..);
kennytm commented 6 months ago
  1. -1 renaming the method to split_«anything» because all existing <[T]>::split* methods use &[T] as the receiver rather than the double-reference &mut &[T]
  2. If we really want to rename these methods I suggest .pop_*() or .drain_*().
  3. +1 on using .verb(..4) rather than .verb_before(4), but -1 on calling the trait std::ops::OneSidedRange<usize> or std::ops::UnboundedRange<usize>. I don't think anybody needed a generic one/zero-sided range trait, in particular this trait cannot support {start: Excluded, end: Unbounded}. I think the trait could just live as std::slice::VerbRange similar to std::slice::SliceIndex.
  4. these methods should can be implemented for str. what happened when the range does not hit the char boundary is an unresolved question.

    impl str {
        fn verb<'a>(self: &mut &'a Self, range: impl VerbRange) -> Option<&'a Self>;
        fn verb_mut<'a>(self: &mut &'a mut Self, range: impl VerbRange) -> Option<&'a mut Self>;
    //  fn verb_first_char<'a>(self: &mut &'a Self) -> Option<char>;
    //  fn verb_last_char<'a>(self: &mut &'a Self) -> Option<char>;
        fn verb_prefix<'a>(self: &mut &'a Self, pattern: impl Pattern<'a>) -> Option<&'a Self>;
        fn verb_suffix<'a>(self: &mut &'a Self, pattern: impl Pattern<'a>) -> Option<&'a Self>;
    }
SimonSapin commented 6 months ago

what happened when the range does not hit the char boundary is an unresolved question.

There is precedent for split_at to panic when given an index not at a char boundary. (Although split_at_checked is now being added)

clarfonthey commented 6 months ago

Side note about full ranges being included in this API, but I think calling these "unbounded ranges" instead of "one-sided ranges" would make the APIs a bit clear. Even though the one-sided ranges have bounds, they still contain an unbounded side.

NyxCode commented 3 months ago

I just stumbled over this feature after writing take_first (I called it pop_front) myself, and wondering if the standard library has something similar. I regularly use &mut &[T] and &mut &str, which I think is a super cool pattern I'd like to see more support for.

Has pop_front and pop_back been considered instead of take_first and take_last? That would mirror the API of VecDeque, which supports cheap "removal" of elements from the front and back as well.

ds84182 commented 3 weeks ago

Should this also have an array version? e.g.

    fn take_first_chunk<'a, const N: usize>(self: &mut &'a Self) -> Option<&'a [T; N]>;
    fn take_first_chunk_mut<'a, const N: usize>(self: &mut &'a mut Self) -> Option<&'a mut [T; N]>;
    fn take_last_chunk<'a, const N: usize>(self: &mut &'a Self) -> Option<&'a [T; N]>;
    fn take_last_chunk_mut<'a, const N: usize>(self: &mut &'a mut Self) -> Option<&'a mut [T; N]>;

Would be very useful for reading and writing chunks of bytes from a slice. u32::from_le_bytes(*(slice.take_first_chunk()))