golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

proposal: strings and bytes: add CutSpace function #63194

Open cblach opened 1 year ago

cblach commented 1 year ago

Go's Cut function is very useful for reducing boilerplate when parsing text or byte segments. Likewise, bytes.Fields and strings.Fields are useful for parsing whitespace-separated data fields.
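
As a quick illustration of the two existing functions, the sketch below shows how strings.Cut and strings.Fields behave today, and what happens when Cut is given a single space as its separator but the input uses several spaces or a tab. The helper name splitKV is made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKV splits line at the first single space using strings.Cut.
// The name is hypothetical and exists only for this example.
func splitKV(line string) (key, value string, found bool) {
	return strings.Cut(line, " ")
}

func main() {
	// Cut splits around the first occurrence of a fixed separator.
	key, value, found := splitKV("name gopher")
	fmt.Printf("%q %q %v\n", key, value, found) // "name" "gopher" true

	// Fields splits around every run of white space.
	fmt.Printf("%q\n", strings.Fields("a \t b\n c")) // ["a" "b" "c"]

	// With a fixed separator, extra spaces stay in the value...
	key, value, found = splitKV("name   gopher")
	fmt.Printf("%q %q %v\n", key, value, found) // "name" "  gopher" true

	// ...and a tab is not recognized as a separator at all.
	key, value, found = splitKV("name\tgopher")
	fmt.Printf("%q %q %v\n", key, value, found) // "name\tgopher" "" false
}
```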

However, when parsing text it is sometimes useful to Cut using white space as the separator, treating a whole run of white space as a single separator (like Fields does). For instance, when scanning and parsing a config file that should allow any white space as the key-value separator, this separates the key and value with less boilerplate:

package main

import (
    "bufio"
    "fmt"
    "strings"
)

func main() {
    config := "key\tthis is the key value\n" +
        "otherkey another key value\n" +
        "keywithnovalue"
    scanner := bufio.NewScanner(strings.NewReader(config))

    for scanner.Scan() {
        key, value, found := strings.CutSpace(scanner.Text())
        fmt.Println("key:", key)
        if found {
            fmt.Println("value:", value)
        }
    }
}
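
For comparison, here is a rough sketch of the same loop written against the strings package as it exists today, which is roughly the boilerplate the proposal aims to remove. The entry type and the parseConfig function are hypothetical names used only for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"unicode"
)

// entry and parseConfig are hypothetical names used only for this sketch.
type entry struct {
	key, value string
	hasValue   bool
}

// parseConfig splits each line at its first run of white space using only
// functions that are already exported by the strings package.
func parseConfig(config string) []entry {
	var entries []entry
	scanner := bufio.NewScanner(strings.NewReader(config))
	for scanner.Scan() {
		line := scanner.Text()
		e := entry{key: line}
		// Find the first white space rune, then skip the rest of the run by hand.
		if i := strings.IndexFunc(line, unicode.IsSpace); i >= 0 {
			e.key = line[:i]
			e.value = strings.TrimLeftFunc(line[i:], unicode.IsSpace)
			e.hasValue = true
		}
		entries = append(entries, e)
	}
	return entries
}

func main() {
	config := "key\tthis is the key value\n" +
		"otherkey another key value\n" +
		"keywithnovalue"
	for _, e := range parseConfig(config) {
		fmt.Println("key:", e.key)
		if e.hasValue {
			fmt.Println("value:", e.value)
		}
	}
}
```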

I propose adding strings.CutSpace and bytes.CutSpace, which slice around the first run of one or more consecutive white space characters, as defined by unicode.IsSpace, returning the text before and after that run.

strings.CutSpace:

// CutSpace slices s around the first run of one or more consecutive white space
// characters, as defined by unicode.IsSpace, returning the text before and after that
// run. The found result reports whether any white space characters appear in s.
// If no white space characters appear in s, CutSpace returns s, "", false.
func CutSpace(s string) (before, after string, found bool) {
    i := indexFunc(s, unicode.IsSpace, true)
    if i == -1 {
        return s, "", false
    }
    return s[:i], TrimLeftFunc(s[i:], unicode.IsSpace), true
}

bytes.CutSpace:

// CutSpace slices s around the first run of one or more consecutive white space
// characters, as defined by unicode.IsSpace, returning the text before and after that
// run. The found result reports whether any white space characters appear in s.
// If no white space characters appear in s, CutSpace returns s, nil, false.
//
// CutSpace returns slices of the original slice s, not copies.
func CutSpace(s []byte) (before, after []byte, found bool) {
    i := indexFunc(s, unicode.IsSpace, true)
    if i == -1 {
        return s, nil, false
    }
    return s[:i], TrimLeftFunc(s[i:], unicode.IsSpace), true
}
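
To make the edge cases concrete, the following is a user-land stand-in for the proposed bytes.CutSpace, written with exported functions only (bytes.IndexFunc rather than the unexported indexFunc). The name cutSpace is an assumption for this sketch; it is not part of the standard library. It runs the function on input with leading white space, trailing white space, and no white space, and shows that the results alias the input slice.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode"
)

// cutSpace is a user-land stand-in for the proposed bytes.CutSpace.
// It is not part of the standard library.
func cutSpace(s []byte) (before, after []byte, found bool) {
	i := bytes.IndexFunc(s, unicode.IsSpace)
	if i == -1 {
		return s, nil, false
	}
	return s[:i], bytes.TrimLeftFunc(s[i:], unicode.IsSpace), true
}

func main() {
	inputs := []string{"key \t value", "  leading", "trailing  ", "nospace", ""}
	for _, in := range inputs {
		before, after, found := cutSpace([]byte(in))
		fmt.Printf("%q -> %q %q %v\n", in, before, after, found)
	}

	// The results are slices of the input, so writing through them changes it.
	line := []byte("key value")
	before, _, _ := cutSpace(line)
	before[0] = 'K'
	fmt.Printf("%s\n", line) // Key value
}
```

Note that input starting with white space yields an empty before with found set to true, so callers that want to ignore indentation would still need to trim the line first.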

seankhliao commented 1 year ago

how common is this?

#46336 had a lot of justification for adding just strings.Cut and nothing else.

gopherbot commented 1 year ago

Change https://go.dev/cl/530835 mentions this issue: strings: added CutSpace, bytes: added CutSpace

ianlancetaylor commented 1 year ago

Do you know whether there is code in the Go standard library that would benefit from this? One of the strongest arguments for adding strings.Cut was the number of places that it could be used in the standard library (see #46336).