golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.92k stars 17.52k forks source link

cmd/compile: compiler assumes `[2]T` does not alias with `struct{a, b T}` which might be allowed by the `unsafe` package #65535

Open Jorropo opened 7 months ago

Jorropo commented 7 months ago

Go version

go version devel go1.23-6076edc55c Mon Feb 5 20:59:15 2024 +0000 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/hugo/.cache/go-build'
GOENV='/home/hugo/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/hugo/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/hugo/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/hugo/k/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/hugo/k/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.23-6076edc55c Mon Feb 5 20:59:15 2024 +0000'
GCCGO='/usr/bin/gccgo'
GOAMD64='v3'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1276548902=/tmp/go-build -gno-record-gcc-switches'

What did you do?

While investing #65495 I noticed that the compiler makes assumptions about aliasing of structs which are not documented in unsafe.

See this piece of code:

package main

import "unsafe"
import "fmt"

type S0 struct{ a, b S1 }
type S1 struct{ a, b S2 }
type S2 struct{ a, b S3 }
type S3 struct{ a, b S4 }
type S4 struct{ a, b S5 }
type S5 struct{ a, b S6 }
type S6 struct{ a, b S7 }
type S7 struct{ a, b S8 }
type S8 struct{ a, b int }

//go:noinline
func Copy(a, b *S0) {
    *a = *b
}

func main() {
    var arr [3]S1
    arr[0].a.a.a.a.a.a.a.a = 42
    arr[1].a.a.a.a.a.a.a.a = 43
    arr[2].a.a.a.a.a.a.a.a = 44
    Copy((*S0)(unsafe.Pointer(&arr[1])), (*S0)(unsafe.Pointer(&arr[0])))
    fmt.Println(arr[2].a.a.a.a.a.a.a.a)
}

What did you see happen?

42

What did you expect to see?

43
Jorropo commented 7 months ago

This happen because the compiler use a non saving copy to handle the copy:

  TEXT command-line-arguments.Copy(SB), ABIInternal
  MOVQ AX, DI
  MOVQ BX, SI
  MOVL $512, CX
  REP
  MOVSQ
  RET

I would guess duffcopy has the same issue.

It seems the compiler assumes structs are either not aliased at all or completely aliased (a == b then copy is a no-op). It's unclear to me why unsafe's doc would disallow this.

(1) Conversion of a T1 to Pointer to T2. Provided that T2 is no larger than T1 and that the two share an equivalent memory layout, this conversion allows reinterpreting data of one type as data of another type.

(*S0)(unsafe.Pointer(&arr[1])) is legal because &arr[1] point to 2 S1 which is just as large and has the same memory layout as S0. Is also legal for the same reasons (*S0)(unsafe.Pointer(&arr[0])) (it now points to 3 S1 which is larger than 1 S0).


I don't know if this could be a bug in unsafe's docs, or in the compiler.


I don't know if it possible to manifest this bug without using unsafe.

Jorropo commented 7 months ago

I found this, so it looks like a sentence is needed in unsafe: https://github.com/golang/go/blob/6076edc55c548878c261316f3e3294f1f73125a3/src/cmd/compile/internal/ssagen/ssa.go#L1382-L1399

randall77 commented 7 months ago

This conversion already violates unsafe rules:

(*S0)(unsafe.Pointer(&arr[1]))

This violates rule 1, casting from a smaller to a larger pointed-to type.

Jorropo commented 7 months ago

I did thought about that. The correct version would be (*S0)(unsafe.Add(unsafe.Pointer(&arr), unsafe.Sizeof(S1{})) it's more verbose and I was lazy.

randall77 commented 7 months ago

Reinterpreting a [2]T to a struct{a, b T} is definitely stretching what is allowed by the unsafe rules. But I don't see any rule explicitly disallowing it.

mknyszek commented 7 months ago

In triage, we agree that the kind of conversion mentioned in this issue is a bit of a stretch. You could argue that converting [2]T to a struct{a, b T} is incorrectly assuming equivalent memory layouts, especially since the spec (technically) allows for things like reordering struct fields.

mknyszek commented 7 months ago

We should perhaps consider clarifying this case in unsafe, in which case this is really more an unsafe package issue as it was originally, but I retitled it to more directly described what @Jorropo discovered.

qiulaidongfeng commented 7 months ago

What about this code that treats any as [2]uintptr?(Currently any is equivalent to a structure of two Pointers, thus converted no problem or rely on existing compiler behavior?) https://gitee.com/qiulaidongfeng/arena/blob/master/buf.go#L150

randall77 commented 7 months ago

@qiulaidongfeng That would almost certainly fall under the same non-guarantee. Layouts of multi-word builtin types (interfaces, slices, strings, complex) should not be guaranteed to be interchangeable by unsafe recasting.

dominikh commented 3 months ago

In triage, we agree that the kind of conversion mentioned in this issue is a bit of a stretch. You could argue that converting [2]T to a struct{a, b T} is incorrectly assuming equivalent memory layouts, especially since the spec (technically) allows for things like reordering struct fields.

This might need to be reevaluated given https://github.com/golang/go/issues/66408