proposal: bytes, strings: add CutByte

aimuz commented 2 weeks ago

Proposal Details

Abstract

This proposal suggests the addition of a new function, CutByte, to the strings and bytes packages in the Go standard library. The function aims to simplify the handling of strings and byte slices by cutting them around the first instance of a specified byte separator. This function is designed to offer a more efficient alternative to the existing Cut function when the separator is a single byte, providing up to a 25% performance improvement in typical use cases.

Background

The existing Cut function in the Go standard library has proven to be extremely useful for handling strings and byte slices by simplifying code and replacing multiple standard library functions. However, in scenarios where the separator is known to be a single byte, the Cut function can be optimized further. The proposed CutByte function addresses this by focusing on byte-level operations, which are common in many real-world applications, such as parsing binary protocols or handling ASCII-based text formats.

Details

``` ./cmd/vet/vet_test.go: if _, suffix, ok := strings.Cut(text, " "); ok { ./cmd/go/scriptreadme_test.go: _, lang, ok := strings.Cut(doc.String(), "# Script Language\n\n") ./cmd/go/scriptreadme_test.go: lang, _, ok = strings.Cut(lang, "\n\nvar ") ./cmd/go/internal/test/test.go: op, name, found := strings.Cut(s, " ") ./cmd/go/internal/vcweb/script.go: mPath, version, ok := strings.Cut(args[1], "@") ./cmd/go/internal/modfetch/fetch.go: _, vers, _ = strings.Cut(vers, "-") ./cmd/go/internal/modfetch/fetch.go: goos, goarch, _ := strings.Cut(vers[i+1:], "-") ./cmd/go/internal/modfetch/codehost/vcs.go: before, after, found := strings.Cut(line, ":") ./cmd/go/internal/modfetch/toolchain.go: _, v, ok := strings.Cut(v, "-") ./cmd/go/internal/modload/list.go: if path, vers, found := strings.Cut(arg, "@"); found { ./cmd/go/internal/modload/list.go: if path, vers, found := strings.Cut(arg, "@"); found { ./cmd/go/internal/modload/build.go: if path, vers, found := strings.Cut(path, "@"); found { ./cmd/go/internal/modload/search.go: elem, rest, found := strings.Cut(p, string(filepath.Separator)) ./cmd/go/internal/modload/init.go: line, _, _ := strings.Cut(string(buf[:n]), "\n") ./cmd/go/internal/envcmd/env.go: key, val, found := strings.Cut(arg, "=") ./cmd/go/internal/modindex/build.go: line, argstr, ok := strings.Cut(strings.TrimSpace(line[4:]), ":") ./cmd/go/internal/modindex/build.go: name, _, _ = strings.Cut(name, ".") ./cmd/go/internal/modindex/build_read.go: path, _, ok = strings.Cut(args[1:], "`") ./cmd/go/internal/script/engine.go: prefix, suffix, ok := strings.Cut(cond.tag, ":") ./cmd/go/internal/script/engine.go: if prefix, suffix, ok := strings.Cut(tag, ":"); ok { ./cmd/go/internal/script/state.go: if k, v, ok := strings.Cut(kv, "="); ok { ./cmd/go/internal/modcmd/edit.go: before, after, found := strings.Cut(arg, "@") ./cmd/go/internal/modcmd/edit.go: before, after, found := strings.Cut(arg, "@") ./cmd/go/internal/modcmd/edit.go: before, after, found := strings.Cut(s, ",") ./cmd/go/internal/modcmd/edit.go: before, after, found := strings.Cut(arg, "=") ./cmd/go/internal/load/godebug.go: k, v, ok := strings.Cut(strings.TrimSpace(text[i:]), "=") ./cmd/go/internal/vcs/vcs.go: pattern, list, found := strings.Cut(item, ":") ./cmd/go/internal/vcs/vcs.go: if scheme, path, ok := strings.Cut(repo, "://"); ok { ./cmd/go/internal/work/gccgo.go: before, after, _ := strings.Cut(args, "=") ./cmd/go/internal/gover/gomod.go: s, _, _ = strings.Cut(s, "//") // strip comments ./cmd/go/internal/gover/mod_test.go: path, vers, _ := strings.Cut(f, " ") ./cmd/go/internal/modget/query.go: pattern, rawVers, found := strings.Cut(raw, "@") ./cmd/go/internal/cmdflag/flag.go: name, value, hasValue := strings.Cut(arg[1:], "=") ./cmd/go/internal/workcmd/edit.go: before, after, found := strings.Cut(arg, "@") ./cmd/go/internal/workcmd/edit.go: before, after, found := strings.Cut(arg, "=") ./cmd/go/internal/toolchain/select.go: min, suffix, plus := strings.Cut(gotoolchain, "+") // go1.2.3+auto ./cmd/go/internal/toolchain/select.go: name, val, hasEq := strings.Cut(a, "=") ./cmd/go/internal/toolchain/select.go: name, _, _ := strings.Cut(a, "=") ./cmd/go/internal/toolchain/select.go: path, version, _ := strings.Cut(pkgArg, "@") ./cmd/go/main.go: _, dir, _ = strings.Cut(a, "=") ./cmd/distpack/pack.go: version, rest, _ := strings.Cut(string(data), "\n") ./cmd/internal/testdir/testdir_test.go: goos, goarch, ok := strings.Cut(*target, "/") ./cmd/internal/testdir/testdir_test.go: line, actionSrc, _ = strings.Cut(actionSrc, "\n") ./cmd/internal/testdir/testdir_test.go: header, _, ok := strings.Cut(src, "\npackage") ./cmd/internal/testdir/testdir_test.go: if _, suffix, ok := strings.Cut(text, " "); ok { ./cmd/internal/testdir/testdir_test.go: lines[i], _, _ = strings.Cut(lines[i], " // ERROR ") ./cmd/internal/testdir/testdir_test.go: errFile, rest, ok := strings.Cut(errStr, ":") ./cmd/internal/testdir/testdir_test.go: lineStr, msg, ok := strings.Cut(rest, ":") ./cmd/compile/internal/base/flag.go: verb, args, found := strings.Cut(line, " ") ./cmd/compile/internal/base/flag.go: before, after, hasEq := strings.Cut(args, "=") ./cmd/link/internal/ld/ld.go: verb, args, found := strings.Cut(line, " ") ./cmd/link/internal/ld/ld.go: before, after, exist := strings.Cut(args, "=") ./cmd/link/internal/ld/go.go: line, data, _ = strings.Cut(data, "\n") ./cmd/link/internal/ld/go.go: if before, after, found := strings.Cut(remote, "#"); found { ./cmd/cgo/internal/testcshared/cshared_test.go: switch prefix, _, _ := strings.Cut(name, "-"); prefix { ./cmd/fix/typecheck.go: if _, elem, ok := strings.Cut(t, "]"); ok { ./cmd/fix/typecheck.go: if _, et, ok := strings.Cut(t, "]"); ok { ./cmd/fix/typecheck.go: if kt, vt, ok := strings.Cut(t[len("map["):], "]"); ok { ./cmd/fix/typecheck.go: _, value, _ = strings.Cut(t, "]") ./cmd/fix/typecheck.go: if k, v, ok := strings.Cut(t[len("map["):], "]"); ok { ./cmd/api/main_test.go: feature, approval, ok := strings.Cut(line, "#") ./cmd/doc/dirs.go: path, dir, _ := strings.Cut(line, "\t") ./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go: if _, v, _ := strings.Cut(e, "="); strings.Contains(v, "PATTERN") { ./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go: k, v, _ := strings.Cut(x, "=") ./cmd/vendor/golang.org/x/tools/cmd/bisect/main.go: line, all, _ = strings.Cut(all, "\n") ./cmd/vendor/golang.org/x/tools/go/analysis/passes/directive/directive.go: line, text, _ = strings.Cut(text, "\n") ./cmd/vendor/golang.org/x/tools/go/analysis/passes/directive/directive.go: _, line, ok = strings.Cut(line, "*/") ./cmd/vendor/golang.org/x/tools/internal/versions/versions.go: v, _, _ = strings.Cut(v, "-") // strip -bigcorp suffix. ./cmd/vendor/golang.org/x/tools/internal/stdlib/stdlib.go: typename, name, _ = strings.Cut(sym.Name, ".") ./cmd/vendor/golang.org/x/tools/internal/stdlib/stdlib.go: recv, name, _ = strings.Cut(sym.Name, ".") ./cmd/vendor/golang.org/x/telemetry/internal/crashmonitor/monitor.go: _, pcstr, ok := strings.Cut(line, " pc=") // e.g. pc=0x%x ./cmd/vendor/golang.org/x/telemetry/internal/config/config.go: prefix, _, found := strings.Cut(c.Name, ":") ./cmd/vendor/golang.org/x/telemetry/internal/config/config.go: prefix, rest, hasBuckets := strings.Cut(counter, "{") ./cmd/vendor/golang.org/x/telemetry/internal/counter/parse.go: k, v, ok := strings.Cut(line, ": ") ./cmd/vendor/golang.org/x/telemetry/internal/upload/reports.go: before, _, _ := strings.Cut(k, "\n") ./cmd/vendor/golang.org/x/build/relnote/relnote.go: dir, rest, _ := strings.Cut(filename, "/") ./cmd/vendor/golang.org/x/build/relnote/relnote.go: dir, rest, _ = strings.Cut(rest, "/") ./crypto/ecdsa/ecdsa_test.go: curve, hash, _ := strings.Cut(line, ",") ./crypto/x509/pem_decrypt.go: mode, hexIV, ok := strings.Cut(dek, ",") ./crypto/x509/verify_test.go: before, _, ok := strings.Cut(string(out), ".") ./crypto/tls/handshake_test.go: _, after, ok := strings.Cut(line, " ") ./crypto/tls/handshake_test.go: before, _, ok := strings.Cut(line, "|") ./strconv/fp_test.go: if mant, exp, ok := strings.Cut(s, "p"); ok { ./strconv/fp_test.go: if mant, exp, ok := strings.Cut(s, "p"); ok { ./strings/example_test.go: before, after, found := strings.Cut(s, sep) ./net/mail/message.go: k, v, ok := strings.Cut(kv, ":") ./net/platform_test.go: net, _, _ := strings.Cut(network, ":") ./net/platform_test.go: switch net, _, _ := strings.Cut(network, ":"); net { ./net/platform_test.go: switch net, _, _ := strings.Cut(network, ":"); net { ./net/url/url.go: u, frag, _ := strings.Cut(rawURL, "#") ./net/url/url.go: rest, url.RawQuery, _ = strings.Cut(rest, "?") ./net/url/url.go: if segment, _, _ := strings.Cut(rest, "/"); strings.Contains(segment, ":") { ./net/url/url.go: username, password, _ := strings.Cut(userinfo, ":") ./net/url/url.go: if segment, _, _ := strings.Cut(path, "/"); strings.Contains(segment, ":") { ./net/url/url.go: key, query, _ = strings.Cut(query, "&") ./net/url/url.go: key, value, _ := strings.Cut(key, "=") ./net/url/url.go: elem, remaining, found = strings.Cut(remaining, "/") ./net/main_posix_test.go: net, _, _ := strings.Cut(network, ":") ./net/http/transport.go: _, text, ok := strings.Cut(resp.Status, " ") ./net/http/response.go: proto, status, ok := strings.Cut(line, " ") ./net/http/response.go: statusCode, _, _ := strings.Cut(resp.Status, " ") ./net/http/request.go: username, password, ok = strings.Cut(cs, ":") ./net/http/request.go: method, rest, ok1 := strings.Cut(line, " ") ./net/http/request.go: requestURI, proto, ok2 := strings.Cut(rest, " ") ./net/http/cgi/host_test.go: k, v, ok := strings.Cut(trimmedLine, "=") ./net/http/cgi/host.go: header, val, ok := strings.Cut(string(line), ":") ./net/http/cgi/child.go: if k, v, ok := strings.Cut(kv, "="); ok { ./net/http/fs.go: start, end, ok := strings.Cut(ra, "-") ./net/http/cookie.go: name, value, found := strings.Cut(s, "=") ./net/http/cookie.go: name, value, ok := strings.Cut(parts[0], "=") ./net/http/cookie.go: attr, val, _ := strings.Cut(parts[i], "=") ./net/http/cookie.go: part, line, _ = strings.Cut(line, ";") ./net/http/cookie.go: name, val, _ := strings.Cut(part, "=") ./net/http/client_test.go: first, rest, _ := strings.Cut(final, ",") ./net/http/main_test.go: _, stack, _ := strings.Cut(g, "\n") ./net/smtp/smtp.go: k, v, _ := strings.Cut(line, " ") ./net/main_test.go: _, stack, _ := strings.Cut(s, "\n") ./go/types/eval_test.go: before, after, _ := strings.Cut(s, sep) ./go/importer/importer_test.go: compiler, target, _ := strings.Cut(export, ":") ./go/printer/comment.go: line, text, _ = strings.Cut(text, "\n") ./go/printer/printer.go: if p, _, ok := strings.Cut(prefix, "*"); ok { ./go/printer/printer.go: before, _, _ := strings.Cut(last, closing) // closing always present ./go/constant/value_test.go: if ns, ds, ok := strings.Cut(a[1], "/"); ok && kind == token.FLOAT { ./go/constant/value_test.go: if as, bs, ok := strings.Cut(lit, "/"); ok { ./go/version/version.go: v, _, _ = strings.Cut(v, "-") // strip -bigcorp suffix. ./go/doc/example_test.go: name, kind, found := strings.Cut(sectionName, ".") ./go/doc/comment/text.go: line, text, _ = strings.Cut(text, "\n") ./go/doc/comment/markdown.go: line, md, _ = strings.Cut(md, "\n") ./go/doc/comment/print.go: line, md, _ = strings.Cut(md, "\n") ./go/doc/comment/print.go: line, rest, ok := strings.Cut(s, "\n") ./go/doc/comment/parse.go: if _, b, ok = strings.Cut(b, "'"); !ok { ./go/doc/comment/parse.go: if _, b, ok = strings.Cut(b, "."); !ok { ./go/doc/headscan.go: inner, s, _ = strings.Cut(s[loc[1]:], html_endh) ./go/build/read_test.go: beforeP, afterP, _ := strings.Cut(tt.in, "ℙ") ./go/build/read_test.go: if beforeD, afterD, ok := strings.Cut(beforeP, "𝔻"); ok { ./go/build/build.go: line, argstr, ok := strings.Cut(strings.TrimSpace(line[4:]), ":") ./go/build/build.go: name, _, _ = strings.Cut(name, ".") ./go/build/vendor_test.go: _, pkg, found = strings.Cut(fullPkg, "/vendor/") ./go/build/build_test.go: errStr, _, _ = strings.Cut(errStr, ";") ./go/build/read.go: path, _, ok = strings.Cut(args[1:], "`") ./go/build/constraint/vers.go: _, v, _ := strings.Cut(z.Tag, "go1.") ./regexp/exec_test.go: loStr, hiStr, _ := strings.Cut(pair, "-") ./regexp/exec_test.go: if _, flag, ok = strings.Cut(flag[1:], ":"); !ok { ./regexp/regexp.go: before, after, ok := strings.Cut(template, "$") ./regexp/syntax/parse.go: lit, t, _ = strings.Cut(t[2:], `\E`) ./archive/tar/writer_test.go: prefix, _, _ = strings.Cut(prefix, "\x00") // Truncate at the NUL terminator ./archive/tar/strconv.go: ss, sn, _ := strings.Cut(s, ".") ./archive/tar/strconv.go: nStr, rest, ok := strings.Cut(s, " ") ./archive/tar/strconv.go: k, v, ok = strings.Cut(rec, "=") ./path/filepath/path.go: part, p, _ = strings.Cut(p, "/") ./runtime/testdata/testprog/traceback_ancestors.go: g, all, _ = strings.Cut(all, "\n\n") ./runtime/testdata/testprog/traceback_ancestors.go: id, _, _ := strings.Cut(strings.TrimPrefix(g, "goroutine "), " ") ./runtime/pprof/proto_test.go: in, out, ok := strings.Cut(tt, "->\n") ./runtime/pprof/pprof_test.go: if _, s, ok = strings.Cut(s, t); !ok { ./runtime/pprof/pprof_test.go: base, kv, ok := strings.Cut(spec, ";") ./runtime/pprof/pprof_test.go: k, v, ok := strings.Cut(kv, "=") ./runtime/pprof/proto.go: loStr, hiStr, ok := strings.Cut(string(addr), "-") ./runtime/trace_cgo_test.go: prefix, tracePath, found := strings.Cut(got, ":") ./runtime/traceback_test.go: _, tb, _ = strings.Cut(tb, "\n") ./runtime/traceback_test.go: line, tb, _ = strings.Cut(tb, "\n") ./runtime/traceback_test.go: funcName, args, found := strings.Cut(line, "(") ./runtime/traceback_system_test.go: _, pcstr, ok := strings.Cut(line, " pc=") // e.g. pc=0x%x ./runtime/debug/mod.go: line, data, ok = strings.Cut(data, newline) ./runtime/debug/mod.go: key, rawValue, ok = strings.Cut(kv, "=") ./internal/testenv/testenv.go: line, goMod, _ = strings.Cut(goMod, "\n") ./internal/testenv/testenv.go: importPath, export, ok := strings.Cut(line, "=") ./internal/zstd/zstd_test.go: want, _, _ := strings.Cut(name, ".") ./encoding/asn1/common.go: part, str, _ = strings.Cut(str, ",") ./encoding/xml/typeinfo.go: if ns, t, ok := strings.Cut(tag, " "); ok { ./encoding/xml/xml.go: } else if space, local, ok := strings.Cut(s, ":"); !ok || space == "" || local == "" { ./encoding/json/tags.go: tag, opt, _ := strings.Cut(tag, ",") ./encoding/json/tags.go: name, s, _ = strings.Cut(s, ",") ./html/template/js.go: mimeType, _, _ = strings.Cut(mimeType, ";") ./html/template/url.go: if protocol, _, ok := strings.Cut(s, ":"); ok && !strings.Contains(protocol, "/") { ./html/template/attr.go: } else if prefix, short, ok := strings.Cut(name, ":"); ok { ./testing/testing_test.go: if name, _, ok := strings.Cut(trimmed, " "); ok { ./log/slog/slogtest_test.go: kv, rest, _ := strings.Cut(s, " ") // assumes exactly one space between attrs ./log/slog/slogtest_test.go: k, value, found := strings.Cut(kv, "=") ./mime/mediatype.go: if major, sub, ok := strings.Cut(t, "/"); !ok { ./mime/mediatype.go: base, _, _ := strings.Cut(v, ";") ./mime/mediatype.go: if baseName, _, ok := strings.Cut(key, "*"); ok { ./mime/encodedword.go: charset, text, _ := strings.Cut(word, "?") ./mime/encodedword.go: encoding, text, _ := strings.Cut(text, "?") ./os/user/lookup_unix.go: u.Name, _, _ = strings.Cut(u.Name, ",") ./os/user/cgo_lookup_unix.go: u.Name, _, _ = strings.Cut(u.Name, ",") ./os/os_test.go: host, _, ok := strings.Cut(hostname, ".") ./os/exec/exec_windows_test.go: k, _, ok := strings.Cut(kv, "=") ./os/exec/exec_test.go: errLine, body, ok := strings.Cut(string(bs), "\n") ./os/exec/exec.go: k, _, ok := strings.Cut(kv, "=") ./text/template/option.go: if key, value, ok := strings.Cut(opt, "="); ok { ```

Proposal

We propose adding the following functions:

For the `strings` package:

// CutByte slices s around the first instance of sep,
// returning the text before and after sep.
// The found result reports whether sep appears in s.
// If sep does not appear in s, CutByte returns s, "", false.
func CutByte(s string, sep byte) (before, after string, found bool) {
    if i := IndexByte(s, sep); i >= 0 {
        return s[:i], s[i+1:], true
    }
    return s, "", false
}

For the `bytes` package:

// CutByte slices s around the first instance of sep,
// returning the text before and after sep.
// The found result reports whether sep appears in s.
// If sep does not appear in s, CutByte returns s, nil, false.
// CutByte returns slices of the original slice s, not copies.
func CutByte(s []byte, sep byte) (before, after []byte, found bool) {
    if i := IndexByte(s, sep); i >= 0 {
        return s[:i], s[i+1:], true
    }
    return s, nil, false
}

Rationale

The CutByte function is specifically optimized for cases where the separator is a single byte. In the analysis of Go's main repository and several large-scale Go projects, a significant number of string manipulations involve single-byte separators. The performance benefit of CutByte over Cut for these cases is approximately 25%, as measured in benchmarks comparing the two functions under similar conditions.

Compatibility

CutByte is a new addition and does not modify any existing interfaces or behavior in the Go standard library. It follows the established patterns and naming conventions of the Go ecosystem, ensuring that it integrates seamlessly with the existing library functions.

Implementation

The implementation of CutByte is straightforward and leverages the existing IndexByte function in both the strings and bytes packages. The proposed functions do not introduce any new dependencies or significant complexities.

Conclusion

Adding CutByte to the Go standard library will provide developers with a more efficient tool for handling common string and byte slice operations involving single-byte separators. This function not only enhances performance but also maintains readability and simplicity, aligning with Go's philosophy of clear and efficient coding practices.

gopherbot commented 2 weeks ago

Change https://go.dev/cl/582176 mentions this issue: bytes, strings: add CutByte

adonovan commented 2 weeks ago

Can the performance of the existing function be improved without adding a new one? For example, how does this compare:

func Cut(s, sep []byte) (before, after []byte, found bool) {
    if len(sep) == 1 {
        if i := IndexByte(s, sep); i >= 0 {
            return s[:i], s[i+1:], true
        }
    }
    if i := Index(s, sep); i >= 0 {
        return s[:i], s[i+len(sep):], true
    }
    return s, nil, false
}

mateusz834 commented 2 weeks ago

@adonovan Index already has a optimization for one-byte strings

https://github.com/golang/go/blob/16ce8b3925deaeb72541ee96b6ee23a08fc21dea/src/internal/stringslite/strings.go#L25-L31

ianlancetaylor commented 2 weeks ago

Please don't use images for plain text. Images are hard to read. Plain text is easy. Thanks.

ianlancetaylor commented 2 weeks ago

See #22148 for a relevant discussion.

adonovan commented 2 weeks ago

Index already has a optimization for one-byte strings

So it does. Are you saying that even with this fastpath, the overhead of evaluating len(substr), two switch conditions, and substr[0] makes it 25% slower? Could you furnish the benchmark you used?

aimuz commented 2 weeks ago

Please don't use images for plain text. Images are hard to read. Plain text is easy. Thanks.

I'm sorry, I've changed it to text, to avoid too much content, now you need to click on Details to see the content

Can the performance of the existing function be improved without adding a new one? For example, how does this compare:

func Cut(s, sep []byte) (before, after []byte, found bool) {
  if len(sep) == 1 {
      if i := IndexByte(s, sep); i >= 0 {
          return s[:i], s[i+1:], true
      }
  }
  if i := Index(s, sep); i >= 0 {
      return s[:i], s[i+len(sep):], true
  }
  return s, nil, false
}

Such code will not improve anything and will even degrade performance

Index already has a optimization for one-byte strings

So it does. Are you saying that even with this fastpath, the overhead of evaluating len(substr), two switch conditions, and substr[0] makes it 25% slower? Could you furnish the benchmark you used?

This is my benchmark code.

func BenchmarkCut(b *testing.B) {
    b.ReportAllocs()
    skip := 32
    s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
    b.Run(fmt.Sprintf("Func=Cut-%d", skip), func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            _, _, _ = Cut(s, "a")
        }
    })
    b.Run(fmt.Sprintf("Func=CutByte-%d", skip), func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            _, _, _ = CutByte(s, 'a')
        }
    })
}

benchstat -col /Func  new.txt  
goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
       │   Cut-32    │             CutByte-32              │
       │   sec/op    │   sec/op     vs base                │
Cut-12   4.644n ± 3%   3.614n ± 1%  -22.18% (p=0.000 n=10)

       │   Cut-32   │           CutByte-32           │
       │    B/op    │    B/op     vs base            │
Cut-12   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

       │   Cut-32   │           CutByte-32           │
       │ allocs/op  │ allocs/op   vs base            │
Cut-12   0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

I would prefer to use the following benchmarking procedure so that I can do benchmark comparisons on different samples. When I tried to run benchstat -col /Func new.txt I got unexpected results, so I switched to the benchmarking above, using a fixed sample of tests


func BenchmarkCut(b *testing.B) {
    b.ReportAllocs()

    for _, skip := range [...]int{2, 4, 8, 16, 32, 64} {
        s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("Func=Cut-%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _, _, _ = Cut(s, "a")
            }
        })
        b.Run(fmt.Sprintf("Func=CutByte-%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _, _, _ = CutByte(s, 'a')
            }
        })
    }
}

See #22148 for a relevant discussion.

IndexByte gets a significant performance boost after CL 148578. So I don't think "Using strings.IndexByte is premature optimisation." is entirely applicable now.

aimuz commented 2 weeks ago

If I've missed any information, please clue me in, thanks!

egonelbre commented 2 weeks ago

I benchmarked this using https://github.com/egonelbre/exp/blob/main/bench/cutbyte/bench_test.go -- and then combining the results, I got the result (using Go master):

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
               │     Cut     │              CutByte               │
               │   sec/op    │   sec/op     vs base               │
Cut/skip=2-32    75.39n ± 2%   75.79n ± 2%       ~ (p=0.152 n=24)
Cut/skip=4-32    75.85n ± 1%   75.65n ± 1%       ~ (p=0.477 n=24)
Cut/skip=8-32    75.15n ± 2%   76.11n ± 1%       ~ (p=0.270 n=24)
Cut/skip=16-32   75.85n ± 1%   75.80n ± 2%       ~ (p=0.927 n=24)
Cut/skip=32-32   76.68n ± 1%   77.08n ± 3%       ~ (p=0.717 n=24)
Cut/skip=64-32   76.34n ± 1%   76.95n ± 2%       ~ (p=0.724 n=24)
geomean          75.88n        76.23n       +0.47%

Individual results:

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
                │     Cut     │              CutByte              │
                │   sec/op    │   sec/op     vs base              │
Cut0/skip=2-32    76.13n ± 1%   73.40n ± 3%  -3.59% (p=0.002 n=6)
Cut0/skip=4-32    75.76n ± 4%   72.25n ± 3%  -4.63% (p=0.002 n=6)
Cut0/skip=8-32    75.47n ± 3%   71.11n ± 3%  -5.78% (p=0.002 n=6)
Cut0/skip=16-32   76.63n ± 3%   70.83n ± 3%  -7.56% (p=0.002 n=6)
Cut0/skip=32-32   76.25n ± 4%   71.83n ± 4%  -5.81% (p=0.009 n=6)
Cut0/skip=64-32   75.91n ± 4%   72.01n ± 3%  -5.14% (p=0.002 n=6)
Cut1/skip=2-32    73.66n ± 2%   75.65n ± 3%  +2.69% (p=0.015 n=6)
Cut1/skip=4-32    75.80n ± 4%   75.72n ± 1%       ~ (p=0.848 n=6)
Cut1/skip=8-32    74.35n ± 2%   76.17n ± 2%  +2.44% (p=0.015 n=6)
Cut1/skip=16-32   74.25n ± 2%   75.65n ± 1%       ~ (p=0.180 n=6)
Cut1/skip=32-32   75.48n ± 2%   76.36n ± 4%       ~ (p=0.485 n=6)
Cut1/skip=64-32   75.60n ± 2%   77.56n ± 3%       ~ (p=0.180 n=6)
Cut2/skip=2-32    73.91n ± 2%   77.81n ± 3%  +5.29% (p=0.002 n=6)
Cut2/skip=4-32    74.81n ± 4%   77.52n ± 4%       ~ (p=0.093 n=6)
Cut2/skip=8-32    74.51n ± 4%   77.62n ± 2%  +4.17% (p=0.004 n=6)
Cut2/skip=16-32   76.07n ± 2%   78.75n ± 2%  +3.52% (p=0.002 n=6)
Cut2/skip=32-32   76.93n ± 2%   78.31n ± 2%  +1.79% (p=0.041 n=6)
Cut2/skip=64-32   76.45n ± 2%   78.09n ± 1%  +2.14% (p=0.009 n=6)
Cut3/skip=2-32    76.83n ± 2%   77.17n ± 3%       ~ (p=0.699 n=6)
Cut3/skip=4-32    75.93n ± 2%   75.77n ± 5%       ~ (p=0.909 n=6)
Cut3/skip=8-32    76.61n ± 2%   76.11n ± 1%       ~ (p=0.937 n=6)
Cut3/skip=16-32   75.64n ± 2%   75.89n ± 2%       ~ (p=0.818 n=6)
Cut3/skip=32-32   76.84n ± 9%   78.55n ± 2%       ~ (p=0.589 n=6)
Cut3/skip=64-32   77.31n ± 3%   76.65n ± 2%       ~ (p=0.310 n=6)
geomean           75.71n        75.66n       -0.07%

I used multiple implementations of the same code, because microbenchmarks are prone to statistical noise due to code layout differences.

aimuz commented 2 weeks ago

Too slow in your tests, maybe a faulty benchmark?

https://github.com/egonelbre/exp/blob/main/bench/cutbyte/bench_test.go#L65

Can you accept it without a global variable?

diff --git a/bench/cutbyte/bench_test.go b/bench/cutbyte/bench_test.go
index f56b9ac..0489ac5 100644
--- a/bench/cutbyte/bench_test.go
+++ b/bench/cutbyte/bench_test.go
@@ -62,8 +62,6 @@ func CutByte3(s string, sep byte) (before, after string, found bool) {
    return s, "", false
 }

-var x, y, z any
-
 func BenchmarkCut0(b *testing.B) {
    b.ReportAllocs()

@@ -71,13 +69,15 @@ func BenchmarkCut0(b *testing.B) {
        s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = Cut0(s, "a")
+               x, y, z := Cut0(s, "a")
+               _, _, _ = x, y, z
            }
        })

        b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = CutByte0(s, 'a')
+               x, y, z := CutByte0(s, 'a')
+               _, _, _ = x, y, z
            }
        })
    }
@@ -89,13 +89,15 @@ func BenchmarkCut1(b *testing.B) {
        s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = Cut1(s, "a")
+               x, y, z := Cut1(s, "a")
+               _, _, _ = x, y, z
            }
        })

        b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = CutByte1(s, 'a')
+               x, y, z := CutByte1(s, 'a')
+               _, _, _ = x, y, z
            }
        })
    }
@@ -108,13 +110,15 @@ func BenchmarkCut2(b *testing.B) {
        s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = Cut2(s, "a")
+               x, y, z := Cut2(s, "a")
+               _, _, _ = x, y, z
            }
        })

        b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = CutByte2(s, 'a')
+               x, y, z := CutByte2(s, 'a')
+               _, _, _ = x, y, z
            }
        })
    }
@@ -127,14 +131,16 @@ func BenchmarkCut3(b *testing.B) {
        s := strings.Repeat(strings.Repeat(" ", skip)+"a"+strings.Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("func=Cut/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = Cut3(s, "a")
+               x, y, z := Cut3(s, "a")
+               _, _, _ = x, y, z
            }
        })

        b.Run(fmt.Sprintf("func=CutByte/skip=%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
-               x, y, z = CutByte3(s, 'a')
+               x, y, z := CutByte3(s, 'a')
+               _, _, _ = x, y, z
            }
        })
    }
-}
\ No newline at end of file
+}

I modified your test case a bit and got the following results

goos: darwin
goarch: arm64
pkg: github.com/egonelbre/exp/bench/cutbyte
                │     Cut     │               CutByte               │
                │   sec/op    │   sec/op     vs base                │
Cut0/skip=2-12    4.172n ± 3%   3.139n ± 1%  -24.75% (p=0.000 n=10)
Cut0/skip=4-12    4.143n ± 2%   3.132n ± 2%  -24.40% (p=0.000 n=10)
Cut0/skip=8-12    4.177n ± 3%   3.174n ± 2%  -24.01% (p=0.000 n=10)
Cut0/skip=16-12   4.151n ± 2%   3.147n ± 1%  -24.17% (p=0.000 n=10)
Cut0/skip=32-12   4.633n ± 3%   3.582n ± 1%  -22.70% (p=0.000 n=10)
Cut0/skip=64-12   4.854n ± 5%   3.936n ± 2%  -18.91% (p=0.000 n=10)
Cut1/skip=2-12    4.172n ± 2%   3.152n ± 2%  -24.45% (p=0.000 n=10)
Cut1/skip=4-12    4.175n ± 3%   3.171n ± 1%  -24.04% (p=0.000 n=10)
Cut1/skip=8-12    4.209n ± 2%   3.161n ± 1%  -24.91% (p=0.000 n=10)
Cut1/skip=16-12   4.232n ± 3%   3.159n ± 1%  -25.33% (p=0.000 n=10)
Cut1/skip=32-12   4.601n ± 5%   3.593n ± 1%  -21.92% (p=0.000 n=10)
Cut1/skip=64-12   5.016n ± 3%   3.857n ± 0%  -23.11% (p=0.000 n=10)
Cut2/skip=2-12    4.153n ± 2%   3.134n ± 1%  -24.52% (p=0.000 n=10)
Cut2/skip=4-12    4.152n ± 1%   3.128n ± 1%  -24.66% (p=0.000 n=10)
Cut2/skip=8-12    4.123n ± 2%   3.135n ± 1%  -23.95% (p=0.000 n=10)
Cut2/skip=16-12   4.154n ± 2%   3.111n ± 2%  -25.13% (p=0.000 n=10)
Cut2/skip=32-12   4.548n ± 3%   3.559n ± 1%  -21.75% (p=0.000 n=10)
Cut2/skip=64-12   4.965n ± 4%   3.853n ± 1%  -22.41% (p=0.000 n=10)
Cut3/skip=2-12    4.098n ± 3%   3.128n ± 1%  -23.67% (p=0.000 n=10)
Cut3/skip=4-12    4.154n ± 3%   3.123n ± 2%  -24.81% (p=0.000 n=10)
Cut3/skip=8-12    4.308n ± 3%   3.114n ± 1%  -27.71% (p=0.000 n=10)
Cut3/skip=16-12   4.181n ± 3%   3.125n ± 1%  -25.27% (p=0.000 n=10)
Cut3/skip=32-12   4.569n ± 3%   3.575n ± 1%  -21.76% (p=0.000 n=10)
Cut3/skip=64-12   4.942n ± 4%   3.883n ± 1%  -21.42% (p=0.000 n=10)
geomean           4.360n        3.324n       -23.76%

                │     Cut      │               CutByte               │
                │     B/op     │    B/op     vs base                 │
Cut0/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                      ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                │     Cut      │               CutByte               │
                │  allocs/op   │ allocs/op   vs base                 │
Cut0/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut0/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut1/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut2/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut3/skip=64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                      ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Thanks to @egonelbre's tip, I changed the benchmarking to the following way

func BenchmarkCut(b *testing.B) {
    b.ReportAllocs()

    for _, skip := range [...]int{2, 4, 8, 16, 32, 64} {
        s := Repeat(Repeat(" ", skip)+"a"+Repeat(" ", skip), 1<<16/skip)
        b.Run(fmt.Sprintf("Func=Cut/%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _, _, _ = Cut(s, "a")
            }
        })
        b.Run(fmt.Sprintf("Func=CutByte/%d", skip), func(b *testing.B) {
            for i := 0; i < b.N; i++ {
                _, _, _ = CutByte(s, 'a')
            }
        })
    }
}

benchstat -col /Func  new.txt 
goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
          │     Cut     │               CutByte               │
          │   sec/op    │   sec/op     vs base                │
Cut/2-12    4.134n ± 2%   3.171n ± 2%  -23.30% (p=0.000 n=10)
Cut/4-12    4.190n ± 2%   3.208n ± 2%  -23.44% (p=0.000 n=10)
Cut/8-12    4.137n ± 2%   3.210n ± 1%  -22.42% (p=0.000 n=10)
Cut/16-12   4.128n ± 1%   3.223n ± 1%  -21.92% (p=0.000 n=10)
Cut/32-12   4.617n ± 1%   3.630n ± 1%  -21.39% (p=0.000 n=10)
Cut/64-12   4.957n ± 1%   3.927n ± 0%  -20.76% (p=0.000 n=10)
geomean     4.349n        3.383n       -22.21%

          │     Cut      │               CutByte               │
          │     B/op     │    B/op     vs base                 │
Cut/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

          │     Cut      │               CutByte               │
          │  allocs/op   │ allocs/op   vs base                 │
Cut/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

egonelbre commented 2 weeks ago

@aimuz, I'm storing in global vars to ensure that code doesn't get optimized away. It could avoid any though to be faster.

I updated the code, and I'm getting:

cpu: AMD Ryzen Threadripper 2950X 16-Core Processor
               │     Cut      │               CutByte               │
               │    sec/op    │   sec/op     vs base                │
Cut/skip=2-32     8.306n ± 1%   7.235n ± 2%  -12.90% (p=0.000 n=24)
Cut/skip=4-32     8.362n ± 1%   7.188n ± 1%  -14.04% (p=0.000 n=24)
Cut/skip=8-32     8.433n ± 7%   7.396n ± 2%  -12.30% (p=0.000 n=24)
Cut/skip=16-32    8.347n ± 1%   7.240n ± 1%  -13.25% (p=0.000 n=24)
Cut/skip=32-32    9.220n ± 1%   7.882n ± 1%  -14.51% (p=0.000 n=24)
Cut/skip=64-32   10.050n ± 1%   8.607n ± 1%  -14.35% (p=0.000 n=24)
geomean           8.764n        7.575n       -13.56%

gopherbot commented 2 weeks ago

Change https://go.dev/cl/582655 mentions this issue: bytes, strings: optimize Cut for single-byte separators

aimuz commented 2 weeks ago

If the length is 1, it does improve, but if the length is greater than 1, it will increase accordingly. Maybe this is acceptable?

The code and improved implementation of the benchmark test will be available at https://go.dev/cl/582655

goos: darwin
goarch: arm64
pkg: strings
cpu: Apple M2 Max
                  │ old-cut.txt │             new-cut.txt             │
                  │   sec/op    │   sec/op     vs base                │
Cut/Cut-One/2-12    4.026n ± 2%   3.274n ± 2%  -18.68% (p=0.000 n=10)
Cut/Cut-Two/2-12    8.093n ± 0%   8.357n ± 0%   +3.27% (p=0.000 n=10)
Cut/Cut-One/4-12    4.048n ± 1%   3.324n ± 2%  -17.91% (p=0.000 n=10)
Cut/Cut-Two/4-12    8.105n ± 0%   8.377n ± 1%   +3.35% (p=0.000 n=10)
Cut/Cut-One/8-12    4.089n ± 1%   3.290n ± 1%  -19.53% (p=0.000 n=10)
Cut/Cut-Two/8-12    8.107n ± 1%   8.359n ± 1%   +3.10% (p=0.000 n=10)
Cut/Cut-One/16-12   4.127n ± 1%   3.328n ± 1%  -19.35% (p=0.000 n=10)
Cut/Cut-Two/16-12   8.119n ± 1%   8.374n ± 1%   +3.15% (p=0.000 n=10)
Cut/Cut-One/32-12   4.545n ± 2%   3.675n ± 1%  -19.14% (p=0.000 n=10)
Cut/Cut-Two/32-12   8.708n ± 1%   8.963n ± 1%   +2.92% (p=0.000 n=10)
Cut/Cut-One/64-12   4.825n ± 2%   4.146n ± 1%  -14.08% (p=0.000 n=10)
Cut/Cut-Two/64-12   9.286n ± 0%   9.315n ± 1%        ~ (p=0.105 n=10)
geomean             5.983n        5.486n        -8.32%

                  │ old-cut.txt  │             new-cut.txt             │
                  │     B/op     │    B/op     vs base                 │
Cut/Cut-One/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                  │ old-cut.txt  │             new-cut.txt             │
                  │  allocs/op   │ allocs/op   vs base                 │
Cut/Cut-One/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/2-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/4-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/8-12    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/16-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/32-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-One/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
Cut/Cut-Two/64-12   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                        ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

aimuz commented 2 weeks ago

Maybe CL 582655 is the best way to go for the current situation? He can get this one optimised without modifying the code in other locations. But for applications with sep greater than 1, he will have a performance loss greater than 2%, is this acceptable?

In go code, I've investigated, and the proportion of sep lengths of 1 is quite large. Maybe we can merge CL 582655?

golang / go