rust-lang / rust


Tracking Issue for `substr_range` and related methods #126769

Open wr7 opened 3 weeks ago

wr7 commented 3 weeks ago

Feature gate: #![feature(substr_range)]

This is a tracking issue for str::substr_range, slice::subslice_range, and slice::elem_offset as described in this ACP.

These methods can be used to extend str::lines, str::split, slice::split, and other related methods.
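As a rough illustration of what "extending" str::lines could look like, the sketch below pairs each line with its byte range in the original text. Because substr_range is still behind a feature gate, the example uses a hypothetical stable stand-in, `substr_range_sketch`, which is only assumed to behave like the proposed method; `lines_with_ranges` is likewise an illustrative name, not part of the proposal.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the unstable `str::substr_range`.
/// Returns the byte range of `substr` within `haystack` if `substr`
/// points into `haystack`'s memory, and `None` otherwise.
/// Note: this compares addresses, it does not search by content.
fn substr_range_sketch(haystack: &str, substr: &str) -> Option<Range<usize>> {
    let hay_start = haystack.as_ptr() as usize;
    let sub_start = substr.as_ptr() as usize;
    // Wrapping arithmetic keeps this well-defined even when `substr`
    // lives before `haystack` in memory or in another allocation.
    let start = sub_start.wrapping_sub(hay_start);
    let end = start.wrapping_add(substr.len());
    if start <= haystack.len() && end <= haystack.len() {
        Some(start..end)
    } else {
        None
    }
}

/// Like `str::lines`, but also reports where each line sits in `text`.
fn lines_with_ranges(text: &str) -> Vec<(Range<usize>, &str)> {
    text.lines()
        .filter_map(|line| substr_range_sketch(text, line).map(|range| (range, line)))
        .collect()
}

fn main() {
    let text = "alpha\nbeta\r\ngamma";
    for (range, line) in lines_with_ranges(text) {
        println!("{range:?}: {line:?}");
    }
}
```

The same pattern should carry over to str::split and slice::split, since those iterators also yield subslices of the original input.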

Public API

impl str {
    fn substr_range(&self, substr: &str) -> Option<Range<usize>>;
}

impl<T> [T] {
    fn subslice_range(&self, subslice: &[T]) -> Option<Range<usize>>;
    fn elem_offset(&self, elem: &T) -> Option<usize>;
}

Steps / History

Unresolved Questions

tgross35 commented 3 weeks ago

Bikeshed: element_offset or item_offset seems better to me than abbreviating to elem; there aren't many truncated words in the slice method names. And e.g. std::simd uses rotate_elements_{left,right} and {Mask,Simd}Element (that is unstable API, but has been around for a while).

wr7 commented 3 weeks ago

> Bikeshed: element_offset or item_offset seems better to me than abbreviating to elem; there aren't many truncated words in the slice method names. And e.g. std::simd uses rotate_elements_{left,right} and {Mask,Simd}Element (that is unstable API, but has been around for a while).

+1. I don't feel strongly about the naming, but I think element_offset might make sense. It's much more descriptive than "item", and the current documentation for slices appears to use the term "element" much more than "item".

I originally thought that "element" seems kinda long, but it's actually shorter than "subslice", so I think it's fine.

The elem_offset name came from a t-libs-api meeting, so I think I'll wait to see if I get more feedback from the Rust community before renaming it.

ericlagergren commented 3 weeks ago

Should they be const?

wr7 commented 3 weeks ago

> Should they be const?

Good question.

I think it would be great for them to be const, but a const implementation does not seem to be trivial in current Rust.

I can see two main ways of implementing something similar to the proposed methods:

  1. (Current implementation) Cast the pointers to usize and then use wrapping_sub and wrapping_add on them. The issue is that pointers cannot be cast to usize in const contexts, and I don't see that changing anytime soon.
  2. Use pointer::byte_offset_from. This is const and would work most of the time, but in the cases where the method should return None, pointer::byte_offset_from would invoke undefined behavior, and there is no obvious way around this.
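Option 1 can be sketched on stable Rust roughly as follows. This is a minimal illustration of the pointer-to-usize approach, not the actual standard library code; the function names `subslice_range_sketch` and `element_offset_sketch` are hypothetical, and panicking on zero-sized types is an assumption made here for simplicity.

```rust
use std::mem::size_of;
use std::ops::Range;

/// Hypothetical sketch: index range of `subslice` within `slice`,
/// computed from addresses rather than contents.
fn subslice_range_sketch<T>(slice: &[T], subslice: &[T]) -> Option<Range<usize>> {
    let size = size_of::<T>();
    // Assumption for this sketch: zero-sized elements are rejected,
    // since every element would share the same address.
    assert!(size != 0, "elements are zero-sized");

    // Pointer-to-integer casts like these are not allowed in const fn,
    // which is why this approach cannot currently be const.
    let slice_start = slice.as_ptr() as usize;
    let sub_start = subslice.as_ptr() as usize;

    // Wrapping subtraction never overflows or triggers undefined behavior;
    // an unrelated pointer simply yields an out-of-range value.
    let byte_start = sub_start.wrapping_sub(slice_start);
    if byte_start % size != 0 {
        return None; // not aligned to an element boundary of `slice`
    }
    let start = byte_start / size;
    let end = start.wrapping_add(subslice.len());
    if start <= slice.len() && end <= slice.len() {
        Some(start..end)
    } else {
        None
    }
}

/// Hypothetical sketch: index of `element` within `slice`, by address.
fn element_offset_sketch<T>(slice: &[T], element: &T) -> Option<usize> {
    let size = size_of::<T>();
    assert!(size != 0, "elements are zero-sized");

    let byte_offset = (element as *const T as usize).wrapping_sub(slice.as_ptr() as usize);
    if byte_offset % size != 0 {
        return None;
    }
    let offset = byte_offset / size;
    if offset < slice.len() { Some(offset) } else { None }
}

fn main() {
    let data = vec![10u32, 20, 30, 40];
    assert_eq!(subslice_range_sketch(&data, &data[1..3]), Some(1..3));
    assert_eq!(element_offset_sketch(&data, &data[2]), Some(2));

    // Equal contents in a different allocation are not a subslice.
    let other = vec![20u32, 30];
    assert_eq!(subslice_range_sketch(&data, &other), None);
}
```

The `None` branch is the case that makes option 2 problematic: computing the distance with byte_offset_from for `other` above would be undefined behavior, whereas the integer arithmetic just produces a value that fails the bounds check.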

Fortunately, the most common use cases aren't currently const either. Methods like str::split return iterators which currently cannot be used in const contexts.

Making these methods const later is a non-breaking change, so I personally think we should move forward with non-const implementations and file a separate issue in the future if there is enough demand for constness.