golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.98k stars 17.67k forks source link

cmd/compile: struct field alive too long #24263

Open randall77 opened 6 years ago

randall77 commented 6 years ago

Program 1:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    y := make([]int, 1e6)
    runtime.GC()
    var stat runtime.MemStats
    runtime.ReadMemStats(&stat)
    fmt.Println(stat.Alloc)
    fmt.Println(len(y))
}

Program 2:

package main

import (
    "fmt"
    "runtime"
)

func main() {
    type T struct {
        x int
        y *[1e6]int
    }
    t := T{x: 7, y: new([1e6]int)}
    runtime.GC()
    var stat runtime.MemStats
    runtime.ReadMemStats(&stat)
    fmt.Println(stat.Alloc)
    fmt.Println(t.x)
}

In program 1, the allocation at make is garbage collected, so program 1 prints just ~100KB of usage. In program 2, the allocation at new is not garbage collected, so it prints ~8MB.

The allocation in program 2 should be GCd, it is no longer referenced. For some reason the compiler is keeping all of t alive even though only one field of it is alive after GC. Strangely, if I replace fmt.Println with println, then the allocation gets GCd. I'm not entirely sure what is going on yet, but it looks like t is being treated as addressable. But it is getting decomposed correctly. Strange.

First reported here: https://groups.google.com/forum/#!topic/golang-nuts/4qmn5gw7OBQ

cherrymui commented 6 years ago

fmt.Println takes interface, so t.x is converted to interface, where its address is taken (passed to convT2E64). println doesn't take interface. Maybe this is the reason?

gua-pian commented 6 years ago

In program2, when t passed to fmt.Println(), it is treated as interface{}, as an interface, it should keep its real value and its type descriptor.

zigo101 commented 6 years ago

@cherrymui the address of a copy of t.x will be taken. It should has no relations to t itself. @dong-hao in program 2, it is t.x instead of t is passed to fmt.Println.

gua-pian commented 6 years ago

@go101 you are right. Consider that t.x is not a variable independent, when pass it to fmt.Println(), t should pass or keep in one piece.I have try

y := t.x
fmt.Println(y)

And the memory usage is low as program1.

cherrymui commented 6 years ago

@go101 Even the interface value actually holds the address of a copy of t.x, when making the interface, it is the address of t.x that is passed to convT2E64. convT2E64 makes the copy.

package p

import "fmt"

func f() {
    type T struct {
        x int
        y *[1e6]int
    }
    t := T{x: 7, y: new([1e6]int)}
    fmt.Println(t.x)
}
    0x004c 00076 (x.go:11)  LEAQ    type.int(SB), AX
    0x0053 00083 (x.go:11)  MOVQ    AX, (SP)
    0x0057 00087 (x.go:11)  LEAQ    "".t+48(SP), AX
    0x005c 00092 (x.go:11)  MOVQ    AX, 8(SP)
    0x0061 00097 (x.go:11)  PCDATA  $0, $1
    0x0061 00097 (x.go:11)  CALL    runtime.convT2E64(SB)

Maybe we can change convT2E64 to take a uint64 value instead of pointer? I'm not sure about its impact and concerns, though.

josharian commented 6 years ago

Maybe we can change convT2E64 to take a uint64 value instead of pointer?

That’s a good idea. Should be safe, because it is guaranteed to be scalar—pointer-shaped values don’t go through a convT2x call. Same probably holds for other specialized functions in runtime/iface.go, although for the multi-word ones (slice, string), it might be better to stick with an address to keep the call site small.

This would make a decent starter change for someone who wants to work on the runtime+compiler. Not trivial, but straightforward enough. Or you could just do it. :)

randall77 commented 6 years ago

That would solve this particular instance, but it doesn't solve the general problem of:

type T struct {
    a A
    b B
}
t := ...
fmt.Printf(t.b) // Retains pointers in t.a

But maybe that's ok and I'm just being too critical of a live variable analysis that is necessarily going to be conservative in some situations.

type T struct {
   a A
   b [4]byte
}
t := ...
x := t.b[:] // this reference will retain pointers in t.a

The thing that I think originally tripped me up is that fmt.Println(t.x) really doesn't look like we're taking the address of t. Maybe there is some way to detect that we're building an interface from part of a larger object and force a separate allocation in that case. I think this would only happen for stack allocations, so maybe forcing a copy of the underlying data before passing the address of it to convT2E and friends would work.

josharian commented 6 years ago

Filed #24286 for Cherry's suggestion.