golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.05k stars 17.68k forks source link

runtime: remove unnecessary allocations in convT2E #8892

Open dvyukov opened 10 years ago

dvyukov commented 10 years ago
This was at lengths discussed in:
https://golang.org/issue/8405
and previously on golang-dev:
https://groups.google.com/d/msg/golang-dev/pwUh0BVFpY0/zqJInvU3NkQJ

Namely, we should allocate heap block for scalars iff the scalar look like a pointer
into heap (otherwise GC will ignore it anyway).

This will allow to have 1-bit/word GC pointer type info *and* don't allocate additional
memory for scalars in interfaces in most cases.
randall77 commented 10 years ago

Comment 1:

So if we're allocating scalars sometimes in the data word and sometimes pointed to by
the data word, then users of the interface need to distinguish those cases.  So, for
example, assertI2T for scalars must check if the result looks like a pointer into the
heap and of so, dereference it.
dvyukov commented 10 years ago

Comment 2:

Right. If it's not intended to be a pointer but looks like a pointer, dereference it.
rsc commented 10 years ago

Comment 3:

I believe we should not do this. It's too clever, and it will come back to bit eus later.
ysmolski commented 6 years ago

@dvyukov, does this issue look relevant to you?

dvyukov commented 6 years ago

@ysmolsky I don't understand the question. What do you mean?

ianlancetaylor commented 6 years ago

@dvyukov I think he is asking whether you think this issue should still be open.

Really this is a question for @aclements and @RLH. Is this a feasible optimization with the current GC?

dvyukov commented 4 years ago

It should still be a feasible optimization because we still support Go pointers to point to C heap, right?

Another potential implementation would work for value types that are smaller than pointer size (e.g. uint16, uint32 on 64-bits, bool, small structs, etc). Namely: let's say we have amd64, pointer size is 64 bits, let's say we are storing uint32 into an interface{}, when we are storing the value we put 0xffffffff into the high 32-bits. This makes this value invalid/outside-of-heap if treated as pointer. Extraction won't require any special code, we just take the low 32 bits. This would require to calculate if this optimization is applicable for a type in the compiler (based on type size, and arch), and adding some high bits in runtime/compiler. Here is a quick proof-of-concept:

--- a/src/cmd/compile/internal/gc/subr.go
+++ b/src/cmd/compile/internal/gc/subr.go
@@ -1857,6 +1857,10 @@ func isdirectiface(t *types.Type) bool {
                return t.NumFields() == 1 && isdirectiface(t.Field(0).Type)
        }

+       if t.Size() == 2 && t.Align == 2 {
+               return true
+       }
+
        return false
 }

--- a/src/runtime/iface.go
+++ b/src/runtime/iface.go
 func convT16(val uint16) (x unsafe.Pointer) {
-       if val < uint16(len(staticuint64s)) {
-               x = unsafe.Pointer(&staticuint64s[val])
-               if sys.BigEndian {
-                       x = add(x, 6)
-               }
-       } else {
-               x = mallocgc(2, uint16Type, false)
-               *(*uint16)(x) = val
-       }
-       return
+       return unsafe.Pointer(uintptr(val) | 1<<63)
 }

This passed bootstrap and almost passed go tool dist test. I guess it also needs some changes in the reflect package. And at this point we obviously want to inline this conversion wholesale because there is no point in the function call now.

Besides avoiding mallocgc call and not generating garbage, this also makes accesses faster (no indirection).

Thoughts?

@dr2chase @mknyszek

dvyukov commented 4 years ago

This is inspired by a real use case. I need to store lots of integers in interfaces, but these won't fit into staticuint64s (up to thousands/tens of thousands). But they perfectly fit into uint32 and that's what I use as the type.

josharian commented 4 years ago

I’m on my phone, but you probably also need to adjust ifaceData in the compiler. And ifaceeq and efaceeq in the runtime. (Also, it’s be more impressive to get uint8/uint32 working—uint16s are not heavily used.)

Another option, which is less likely to cause rare breakage (e.g. in assembly) is to keep an atomic counter in the slow path of convT32 for smallish values and, when used enough, persistent alloc and populate a staticuint64s-like array with a larger range.

Or keep a per-P cache of recently allocated values so that we can re-use them. (I tried this for string interning, and it was pretty straightforward.)

dr2chase commented 4 years ago

There's an endianness issue in that code, that's a minor problem, also the description of the pointer smash (0xffffffff) in upper 32 bits doesn't match what the code does (uintptr(val) | 1<<63).

But otherwise, interesting, though I am worried about assembly language.

On Sat, Apr 25, 2020 at 12:21 PM Josh Bleecher Snyder < notifications@github.com> wrote:

I’m on my phone, but you probably also need to adjust ifaceData in the compiler. And ifaceeq and efaceeq in the runtime. (Also, it’s be more impressive to get uint8/uint32 working—uint16s are not heavily used.)

Another option, which is less likely to cause rare breakage (e.g. in assembly) is to keep an atomic counter in the slow path of convT32 for smallish values and, when used enough, persistent alloc and populate a staticuint64s-like array with a larger range.

Or keep a per-P cache of recently allocated values so that we can re-use them. (I tried this for string interning, and it was pretty straightforward.)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/8892#issuecomment-619404170, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOW6JZM77AJJBGOQP7NRXLROMEXTANCNFSM4GCAPQ3Q .