shirou / gopsutil

psutil for golang
Other
10.66k stars 1.59k forks source link

wrong usage of os.FindProcess #1709

Open NitroCao opened 2 months ago

NitroCao commented 2 months ago

Describe the bug When using gopsutil to traverse all processes on Linux systems which support pidfd_open(2), the process will hold too many file descriptors with pidfd type. The bug is caused by the wrong usage of os.FindProcess() in PidExistsWithContext, we need call proc.Release() immediately after os.FindProcess() returns: https://github.com/shirou/gopsutil/blob/e89f21d7dcc0d59e36b08a2b0a6309345f42d3fe/process/process_posix.go#L103-L112 The correct code would be:

func PidExistsWithContext(ctx context.Context, pid int32) (bool, error) {
    if pid <= 0 {
        return false, fmt.Errorf("invalid pid %v", pid)
    }
    proc, err := os.FindProcess(int(pid))
    if err != nil {
        return false, err
    }
    defer proc.Release()

    if isMount(common.HostProcWithContext(ctx)) { // if /<HOST_PROC>/proc exists and is mounted, check if /<HOST_PROC>/proc/<PID> folder exists

The doc in os library:

Release releases any resources associated with the Process p, rendering it unusable in the future. Release only needs to be called if Process.Wait is not.

The reason is for kernels which support pidfd_open(2), os.FindProcess() will call pidfd_open(2) and save the fd into Process.handle, and then close it when (*Process).Release() called. https://github.com/golang/go/blob/e9a500f47dadcd73c970649a1072d28997617610/src/os/exec_unix.go#L156-L170

func findProcess(pid int) (p *Process, err error) {
    h, err := pidfdFind(pid)
    if err == ErrProcessDone {
        // We can't return an error here since users are not expecting
        // it. Instead, return a process with a "done" state already
        // and let a subsequent Signal or Wait call catch that.
        return newDoneProcess(pid), nil
    } else if err != nil {
        // Ignore other errors from pidfdFind, as the callers
        // do not expect them. Fall back to using the PID.
        return newPIDProcess(pid), nil
    }
    // Use the handle.
    return newHandleProcess(pid, h), nil
}

https://github.com/golang/go/blob/e9a500f47dadcd73c970649a1072d28997617610/src/os/exec.go#L219-L221

func (p *Process) handlePersistentRelease(reason processStatus) processStatus {
    if p.mode != modeHandle {
        panic("handlePersistentRelease called in invalid mode")
    }

    for {
        refs := p.state.Load()
        status := processStatus(refs & processStatusMask)
        if status != statusOK {
            // Both Release and successful Wait will drop the
            // Process' persistent reference on the handle. We
            // can't allow concurrent calls to drop the reference
            // twice, so we use the status as a guard to ensure the
            // reference is dropped exactly once.
            return status
        }
        if refs == 0 {
            // This should never happen because dropping the
            // persistent reference always sets a status.
            panic("release of handle with refcount 0")
        }
        new := (refs - 1) | uint64(reason)
        if !p.state.CompareAndSwap(refs, new) {
            continue
        }
        if new&^processStatusMask == 0 {
            p.closeHandle()
        }
        return status
    }
}

Besides SendSignalWithContext is also affected.

To Reproduce

package main

import (
    "log"
    "os"
    "time"

    "github.com/shirou/gopsutil/v4/process"
)

func main() {
    log.Printf("my pid: %d", os.Getpid())
    if err != nil {
        log.Fatal(err)
    }
    pids, err := process.Pids()
    for _, pid := range pids {
        p, err := process.NewProcess(pid)
        if err != nil {
            continue
        }
        log.Println(p.Pid)
    }

    log.Printf("the number of running procs: %d", len(pids))
    time.Sleep(30 * time.Second)
}

compile it by executing go build -o demo main.go. After executing the demo code, use ls -l /proc/$(pgrep demo)/fd | grep pidfd | wc -l command to count the number of pidfds. Expected behavior

Environment (please complete the following information):

Additional context