tinygo-org / tinygo

Go compiler for small places. Microcontrollers, WebAssembly (WASM/WASI), and command-line tools. Based on LLVM.
https://tinygo.org
Other
15.32k stars 905 forks source link

compiler: strconv.AppendUint allocates #4525

Open eliasnaur opened 5 days ago

eliasnaur commented 5 days ago
$ cat byteslicegarbage_test.go
package byteslicegarbage

import (
    "strconv"
    "testing"
)

func TestAllocs(t *testing.T) {
    dst := make([]byte, 50)
    res := testing.Benchmark(func(b *testing.B) {
        for range b.N {
            strconv.AppendUint(dst[:0], 10000, 10)
        }
    })
    if string(dst)[:2] != "10" {
        t.Errorf("AppendUint: %q", string(dst))
    }
    if a := res.AllocsPerOp(); a != 0 {
        t.Errorf("AppendUint allocates %d, expected none", a)
    }
}
$ go test
$ tinygo test
--- FAIL: TestAllocs (2.42s)
    AppendUint allocates 1, expected none
FAIL
FAIL    blah    2.625s

This allocation is caused by the folowing pattern (from strconv.formatBits):

func formatBits(dst []byte, ..., append bool) ([]byte, string) {
    var buf [65]byte
    ...
    // Fill buf.
    ....
    if append {
        return append(dst), ""
    }
    return nil, string(buf)   
}

As you can see, the conversion to string is conditional, but TinyGo nevertheless decides to allocate buf on the heap. This saves a copy in the non-append case, but incurs an allocation in the performance-critical append case.

From a cursory glance at compiler/compiler.go, ssa.Convert operations are translated to runtime.stringFromBytes, so some other pass must transform that to the escaping buffer. Perhaps just inlining and general optimization?

dgryski commented 3 days ago

I wonder if we could get away with noescape tags on stringFromBytes ?

dgryski commented 3 days ago
~/go/src/github.com/tinygo-org/tinygo/src/runtime $ git diff
diff --git a/compiler/symbol.go b/compiler/symbol.go
index c2007cfd..5447e0db 100644
--- a/compiler/symbol.go
+++ b/compiler/symbol.go
@@ -172,6 +172,9 @@ func (c *compilerContext) getFunction(fn *ssa.Function) (llvm.Type, llvm.Value)
                llvmFn.AddAttributeAtIndex(1, c.ctx.CreateEnumAttribute(llvm.AttributeKindID("nocapture"), 0))
                llvmFn.AddAttributeAtIndex(2, c.ctx.CreateEnumAttribute(llvm.AttributeKindID("readonly"), 0))
                llvmFn.AddAttributeAtIndex(2, c.ctx.CreateEnumAttribute(llvm.AttributeKindID("nocapture"), 0))
+       case "runtime.stringFromBytes":
+               llvmFn.AddAttributeAtIndex(1, c.ctx.CreateEnumAttribute(llvm.AttributeKindID("nocapture"), 0))
+               llvmFn.AddAttributeAtIndex(1, c.ctx.CreateEnumAttribute(llvm.AttributeKindID("readonly"), 0))
        case "runtime.trackPointer":
                // This function is necessary for tracking pointers on the stack in a
                // portable way (see gc_stack_portable.go). Indicate to the optimizer

This is a fix, but I'd like @aykevl to chime in if this is the proper fix.

~/go/src/github.com/dgryski/bug/byteslicegarbage $ tinygo test -print-allocs=strconv
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/ftoa.go:164:10: object allocated on the heap: escapes at line 171
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/ftoa.go:145:7: object allocated on the heap: escapes at line 147
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/ftoa.go:117:7: object allocated on the heap: escapes at line 118
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/atoi.go:56:18: object allocated on the heap: escapes at line 56
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/atoi.go:52:18: object allocated on the heap: escapes at line 52
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/atoi.go:48:18: object allocated on the heap: escapes at line 48
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/quote.go:35:15: object allocated on the heap: size is not constant
/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/quote.go:24:37: object allocated on the heap: size is not constant
ok      github.com/dgryski/bug/byteslicegarbage 1.810s

Before the PR there is an additional line:

/Users/dgryski/go/src/go.googlesource.com/go/src/strconv/itoa.go:94:6: object allocated on the heap: escapes at line 199

which is now removed.

It seems to me some of those other lines could also be investigated and improve our escape analysis.

aykevl commented 2 days ago

As you can see, the conversion to string is conditional, but TinyGo nevertheless decides to allocate buf on the heap. This saves a copy in the non-append case, but incurs an allocation in the performance-critical append case.

Are you sure it's the append? Converting a byte slice to a string also allocates (see the alloc call):

https://github.com/tinygo-org/tinygo/blob/ef4f46f1d1550beb62324d750c496b2b4a7f76d0/src/runtime/string.go#L70-L78

The reason it needs to be a new heap object is because the compiler currently can't figure out that buf is not written to anymore after converting to a string. If buf would be written to, and the string would share the underlying object, the string would be changed too which would be Very Bad. Also see: https://github.com/tinygo-org/tinygo/pull/4289

This is a fix, but I'd like @aykevl to chime in if this is the proper fix.

Looks good to me! These attributes should have been inferred by the compiler, but maybe that only happens in a later pass. I don't see a problem with adding them earlier to help the optimizer.

eliasnaur commented 2 days ago

As you can see, the conversion to string is conditional, but TinyGo nevertheless decides to allocate buf on the heap. This saves a copy in the non-append case, but incurs an allocation in the performance-critical append case.

Are you sure it's the append?

I didn't mean to imply the append allocates. What I saw in the compiled code is that buf is allocated on the heap, and the call to runtime.stringFromBytes is gone, presumably because LLVM decides that once buf is allocated it's safe to alias the memory.

It makes perfect sense to me that once runtime.stringFromBytes is marked as noescape/nocapture LLVM no longer forces buf onto the heap. Thanks @dgryski.