golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

runtime: stack trace stops at a .s file, so I cannot find the source of the error #69920

Open epowsal opened 6 days ago

epowsal commented 6 days ago

Go version

1.23.1

Output of go env in your module/workspace:

no

What did you do?

I added stack logging to the Go source to debug an error.

What did you see happen?

/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1545:
<-M:/work/tool/go1.23.1.windows-amd64/src/runtime/asm_amd64.s 1700:
M:/work/tool/go1.23.1.windows-amd64/src/net/fd_windows.go 182:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/sock_posix.go 124:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/sock_posix.go 70:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/ipsock_posix.go 167:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/tcpsock_posix.go 85:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/tcpsock_posix.go 75:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/tcpsock_posix.go 71:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/dial.go 670:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/dial.go 635:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/dial.go 536:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/dial.go 527:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1226:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1728:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1563:
<-M:/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1545:
<-M:/work/tool/go1.23.1.windows-amd64/src/runtime/asm_amd64.s 1700:

The stack trace has no further file entries after the .s file.

What did you expect to see?

Every .go file on the stack should be listed.

randall77 commented 5 days ago

Unfortunately, I don't really understand what this issue is intended to describe.

Please fill out the "What did you do?" section in much more detail. Do you have a program that we could reproduce with? It sounds like you are using a debugger. Which debugger? How did you invoke it? The more actual command lines you give us that we can reproduce with, the better.

epowsal commented 5 days ago

No debugger. I use the runtime package to write every stack frame to a file: I just run the built .exe and record the stack to a file. But the recorded stack is not complete. It loses the *.go files, for example main.go.

epowsal commented 5 days ago

I added the stack file output in fd_windows.go.

randall77 commented 5 days ago

I still don't understand the problem. We really need a precise description of what you did and what you saw.

  1. Can you share the exact program you are running? i.e., All the .go files you used to build it.
  2. Please put into this issue the exact commands you ran. All of them for building & running the program.
  3. Can you paste into the issue the exact output you're seeing?
  4. You mention "stack file" a few times. I don't know what that is. Can you explain?

epowsal commented 4 days ago

It is cmd/go.exe. I only modified fd_windows.go.

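// S returns up to n caller frames as "file line:" entries joined by "<-",
// starting with the caller of S. It requires fmt and runtime to be imported.
func S(n int) string {
    fstackstr := ""
    for i := 1; i <= n; i++ {
        _, file, line, ok := runtime.Caller(i)
        if !ok {
            // No more frames on this goroutine's stack.
            break
        }
        if fstackstr != "" {
            fstackstr += "<-"
        }
        fstackstr += fmt.Sprintf("%s %d:", file, line)
    }
    return fstackstr
}

// AppendFile appends data to the file at filepath, creating it if needed.
// No separator is written, so successive records run together in the file.
func AppendFile(filepath string, data []byte) error {
    // O_APPEND positions every write at the end of the file, so no Seek
    // (and no deprecated os.SEEK_END) is needed.
    ff, err := os.OpenFile(filepath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0666)
    if err != nil {
        return err
    }
    defer ff.Close()
    _, err = ff.Write(data)
    return err
}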

// Always returns nil for connected peer address result.
func (fd *netFD) connect(ctx context.Context, la, ra syscall.Sockaddr) (syscall.Sockaddr, error) {
    // Do not need to call fd.writeLock here,
    // because fd is not yet accessible to user,
    // so no concurrent operations are possible.
    if err := fd.init(); err != nil {
        return nil, err
    }

    if ctx.Done() != nil {
        // Propagate the Context's deadline and cancellation.
        // If the context is already done, or if it has a nonzero deadline,
        // ensure that that is applied before the call to ConnectEx begins
        // so that we don't return spurious connections.
        defer fd.pfd.SetWriteDeadline(noDeadline)

        if ctx.Err() != nil {
            fd.pfd.SetWriteDeadline(aLongTimeAgo)
        } else {
            if deadline, ok := ctx.Deadline(); ok && !deadline.IsZero() {
                fd.pfd.SetWriteDeadline(deadline)
            }

            done := make(chan struct{})
            stop := context.AfterFunc(ctx, func() {
                // Force the runtime's poller to immediately give
                // up waiting for writability.
                fd.pfd.SetWriteDeadline(aLongTimeAgo)
                close(done)
            })
            defer func() {
                if !stop() {
                    // Wait for the call to SetWriteDeadline to complete so that we can
                    // reset the deadline if everything else succeeded.
                    <-done
                }
            }()
        }
    }

    if !canUseConnectEx(fd.net) {
        err := connectFunc(fd.pfd.Sysfd, ra)
        return nil, os.NewSyscallError("connect", err)
    }
    // ConnectEx windows API requires an unconnected, previously bound socket.
    if la == nil {
        switch ra.(type) {
        case *syscall.SockaddrInet4:
            la = &syscall.SockaddrInet4{}
        case *syscall.SockaddrInet6:
            la = &syscall.SockaddrInet6{}
        default:
            panic("unexpected type in connect")
        }
        if err := syscall.Bind(fd.pfd.Sysfd, la); err != nil {
            return nil, os.NewSyscallError("bind", err)
        }
    }

    var isloopback bool
    switch ra := ra.(type) {
    case *syscall.SockaddrInet4:
        isloopback = ra.Addr[0] == 127
    case *syscall.SockaddrInet6:
        isloopback = ra.Addr == [16]byte(IPv6loopback)
    default:
        panic("unexpected type in connect")
    }
    if isloopback {
        // This makes ConnectEx() fails faster if the target port on the localhost
        // is not reachable, instead of waiting for 2s.
        params := windows.TCP_INITIAL_RTO_PARAMETERS{
            Rtt:                   windows.TCP_INITIAL_RTO_UNSPECIFIED_RTT, // use the default or overridden by the Administrator
            MaxSynRetransmissions: 1,                                       // minimum possible value before Windows 10.0.16299
        }
        if windows.SupportTCPInitialRTONoSYNRetransmissions() {
            // In Windows 10.0.16299 TCP_INITIAL_RTO_NO_SYN_RETRANSMISSIONS makes ConnectEx() fails instantly.
            params.MaxSynRetransmissions = windows.TCP_INITIAL_RTO_NO_SYN_RETRANSMISSIONS
        }
        var out uint32
        // Don't abort the connection if WSAIoctl fails, as it is only an optimization.
        // If it fails reliably, we expect TestDialClosedPortFailFast to detect it.
        _ = fd.pfd.WSAIoctl(windows.SIO_TCP_INITIAL_RTO, (*byte)(unsafe.Pointer(&params)), uint32(unsafe.Sizeof(params)), nil, 0, &out, nil, 0)
    }

    //AppendFile("wlbneterr.log", []byte(S(32)))

    // Call ConnectEx API.
    if err := fd.pfd.ConnectEx(ra); err != nil {
        select {
        case <-ctx.Done():
            return nil, mapErr(ctx.Err())
        default:
            if _, ok := err.(syscall.Errno); ok {
                err = os.NewSyscallError("connectex", err)
            }
            return nil, err
        }
    }
    // Refresh socket properties.
    return nil, os.NewSyscallError("setsockopt", syscall.Setsockopt(fd.pfd.Sysfd, syscall.SOL_SOCKET, syscall.SO_UPDATE_CONNECT_CONTEXT, (*byte)(unsafe.Pointer(&fd.pfd.Sysfd)), int32(unsafe.Sizeof(fd.pfd.Sysfd))))
}

prattmic commented 1 day ago

the recorded stack is not complete. It loses the *.go files, for example main.go.

What do you believe is missing?

The original stack you listed ends with:

/work/tool/go1.23.1.windows-amd64/src/net/http/transport.go 1545:
/work/tool/go1.23.1.windows-amd64/src/runtime/asm_amd64.s 1700:

That location in transport.go is the start of a new goroutine (https://cs.opensource.google/go/go/+/refs/tags/go1.23.1:src/net/http/transport.go;l=1545;bpv=0), and asm_amd64.s:1700 is runtime.goexit (which always goes above goroutine start).

This seems correct to me. A goroutine's stack starts at the beginning of the goroutine, so frames from the code that started the goroutine (such as main.go) are not part of it.
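
The behaviour described above can be reproduced outside the net package. The following is only a minimal sketch, not code from this issue: the names callerNames, createdByLine, and inNewGoroutine are hypothetical helpers introduced for illustration. It walks the stack from inside a freshly started goroutine and, separately, captures the text traceback from runtime.Stack, which should carry a "created by" line naming the function that executed the go statement.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// callerNames (hypothetical helper) returns the function names on the current
// goroutine's stack, innermost first, starting at the caller of callerNames.
func callerNames() []string {
	pcs := make([]uintptr, 32)
	n := runtime.Callers(2, pcs) // skip runtime.Callers and callerNames
	frames := runtime.CallersFrames(pcs[:n])
	var names []string
	for {
		frame, more := frames.Next()
		names = append(names, frame.Function)
		if !more {
			break
		}
	}
	return names
}

// createdByLine (hypothetical helper) returns the first "created by" line of
// a text traceback, or "" if the traceback has none.
func createdByLine(trace string) string {
	for _, line := range strings.Split(trace, "\n") {
		if strings.HasPrefix(line, "created by ") {
			return line
		}
	}
	return ""
}

// inNewGoroutine (hypothetical helper) starts a goroutine and reports what
// that goroutine sees: its frame names and its runtime.Stack traceback.
func inNewGoroutine() (names []string, trace string) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		names = callerNames()
		buf := make([]byte, 8192)
		trace = string(buf[:runtime.Stack(buf, false)])
	}()
	<-done // the close above makes the goroutine's writes visible here
	return names, trace
}

func main() {
	names, trace := inNewGoroutine()
	// The frame list ends at runtime.goexit; main.main is not part of it.
	fmt.Println(strings.Join(names, " <- "))
	// The text traceback additionally records where the goroutine was started.
	fmt.Println(createdByLine(trace))
}
```

If the goal is to learn which code started the goroutine, a runtime.Caller loop cannot provide that, because those frames are on a different stack. Logging the runtime.Stack output (or debug.Stack) instead would include the "created by" line, at the cost of a more verbose log.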