rust-lang / libs-team


Add `str::reverse` method for in place string reversal #353

Closed: alexstanovoy closed this 4 months ago

alexstanovoy commented 4 months ago

Proposal

Problem statement

Currently, to reverse a UTF-8 string you need to write a freestanding function like `fn reverse(&mut str)` or use `s.chars().rev().collect()`, which requires an allocation.

Motivating examples or use cases

I can't recall a real-world example; this has been on my list for a while. Why not, though? :)

Solution sketch

The design should be the same as `[T]::reverse`. First, reverse all bytes; then reverse the bytes of each char again, so that every code point is once more a valid UTF-8 sequence.
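The two-pass idea above can be sketched roughly as follows. This is only an illustration, not the proposed std implementation: `reverse_chars_in_place` is a hypothetical name, and it takes an owned `String` instead of `&mut str` so that it can stay in safe code. It reuses the string's existing buffer rather than collecting into a new one.

```rust
/// Hypothetical helper (not a std API): reverses the code points of `s`
/// while reusing the buffer the string already owns.
fn reverse_chars_in_place(s: String) -> String {
    let mut bytes = s.into_bytes();

    // Pass 1: reverse every byte. Each multi-byte sequence is now backwards:
    // its continuation bytes (0b10xxxxxx) come first, then its lead byte.
    bytes.reverse();

    // Pass 2: restore each sequence by reversing it again.
    let mut start = 0;
    for i in 0..bytes.len() {
        // A non-continuation byte is a lead (or ASCII) byte, which marks
        // the end of one reversed sequence.
        if bytes[i] & 0xC0 != 0x80 {
            bytes[start..=i].reverse();
            start = i + 1;
        }
    }

    // The input was valid UTF-8, so the result should be as well.
    String::from_utf8(bytes).expect("code point reversal keeps UTF-8 valid")
}
```

For input that is valid UTF-8 this should agree with `s.chars().rev().collect::<String>()`. It still reverses code points, not graphemes.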

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

BurntSushi commented 4 months ago

> I can't recall a real-world example; this has been on my list for a while. Why not, though? :)

I think we need something more compelling than this for std. And I don't see the correctness problems with reversing based on codepoint acknowledged here.

Note that bstr has `reverse_bytes`, `reverse_chars` and `reverse_graphemes`. Arguably, `reverse_graphemes` is the most correct and the least likely to produce surprising results. But std doesn't have grapheme segmentation.
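To make the correctness concern concrete, here is a small std-only sketch. The helper name `reverse_code_points` is made up for illustration. It shows code point reversal moving a combining accent onto a different letter and turning one flag into another.

```rust
/// Hypothetical helper: reverses by code point, the way
/// `.chars().rev().collect()` does.
fn reverse_code_points(s: &str) -> String {
    s.chars().rev().collect()
}

fn demo() -> (String, String) {
    // "noël" spelled with a combining diaeresis-like accent (U+0301) after 'e'.
    // After reversal the accent follows 'l' instead of 'e'.
    let accent = reverse_code_points("noe\u{301}l");

    // A flag is two regional indicator code points; swapping them
    // yields a different flag, not a reversed one.
    let flag = reverse_code_points("\u{1F1F8}\u{1F1EA}");

    (accent, flag)
}
```

A grapheme-aware reversal would keep `e` and its accent together, which is what most readers would expect.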

alexstanovoy commented 4 months ago

> But std doesn't have grapheme segmentation.

Just in case, am I right that since std doesn't have grapheme segmentation, the following code is correct for code point reversal? https://pastebin.com/9qnamCrD If so, I'll add a real example and open a pull request.

alexstanovoy commented 4 months ago

Yeah, it looks like users usually expect to reverse graphemes, not code points. Although it might be handy to have an idiomatic alternative to `.chars().rev().collect()` in std, after your comment I don't think it's really useful. Thanks for your time! :)

scottmcm commented 4 months ago

My favourite example for why .chars().rev() is sketchy: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0cf380056e06601b85f7c6a7887e90ae

[src/main.rs:2:5] "🇸🇪".chars().rev().collect::<String>() = "🇪🇸"

I think the common places you might want this are LeetCode-style problems, where https://github.com/rust-lang/rust/issues/110998 might be the better way, since `<[ascii::Char]>::reverse` will work fine.
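Since `ascii::Char` is still unstable at the time of writing, a rough stable approximation of the ASCII-only case might look like the sketch below. `reverse_ascii_in_place` is a hypothetical name, not an existing API, and it leaves non-ASCII input untouched.

```rust
/// Hypothetical helper: reverses `s` in place if it is pure ASCII.
/// Returns `false` and leaves `s` unchanged otherwise.
fn reverse_ascii_in_place(s: &mut str) -> bool {
    if !s.is_ascii() {
        return false;
    }
    // SAFETY: every byte is ASCII, so any reordering of the bytes
    // is still valid UTF-8.
    unsafe { s.as_bytes_mut() }.reverse();
    true
}
```

For ASCII input, bytes, code points and graphemes mostly coincide (a `\r\n` pair being one exception), so this avoids most of the ambiguity discussed above.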