shirou / gopsutil

psutil for golang
Other
10.48k stars 1.58k forks source link

Use netlink API to get net connections for better performance #695

Open wcc526 opened 5 years ago

wcc526 commented 5 years ago

/proc interface is inadequate, unfortunately. When amount of sockets is enough large, netstat or even plain cat /proc/net/tcp/ cause nothing but pains and curses. In linux-2.4 the desease became worse: even if amount of sockets is small reading /proc/net/tcp/ is slow enough https://www.cyberciti.biz/files/ss.html

https://github.com/shirou/gopsutil/blob/master/net/net_linux.go

like ss command

https://github.com/raboof/connbeat/blob/master/sockets/tcp_diag/tcp_diag.go

https://stackoverflow.com/questions/11763376/difference-between-netstat-and-ss-in-linux

https://pcarleton.com/2018/05/31/netstat-vs-ss/

Lomanic commented 5 years ago

That looks doable, there is even quite a few netlink libraries available for Go. Would you be interested working on it @wcc526?

wcc526 commented 5 years ago

OK,I tried it.

Brian-Williams commented 4 years ago

@shirou Expanding on the idea to use gosigar from this comment.

I briefly looked into this and I think there are some open questions about how this can/should be implemented.

Here is a non-working implementation outline to create talking points:

func sigarLookup(ctx context.Context, kind string) ([]ConnectionStat, error) {
    tmap, ok := familyKindMap[kind]
    if !ok {
        return nil, fmt.Errorf("invalid kind, %s", kind)
    }
    msg := sigar.NewInetDiagReqV2(tmap)
    // TODO: set correct sequence 
    // msg.Header.Seq = 
    msgs, err := sigar.NetlinkInetDiag(msg)
    if err != nil {
        return nil, fmt.Errorf("couldn't send netlink message: %v", err)
    }
    var cs []ConnectionStat
    for _, diag := range msgs {
        conn := ConnectionStat{
            //Fd:     c.fd,
            Family: uint32(diag.Family),
            // Breaking change: this will always be netlink; for backwards compat we would have to loop over previous
            // tmap and use that info
            Type:   syscall.AF_NETLINK,
            Laddr:  Addr{diag.SrcIP().String(), uint32(diag.SrcPort())},
            Raddr:  Addr{diag.DstIP().String(), uint32(diag.DstPort())},
            Status: sigar.TCPState(diag.State).String(),
            // todo: I believe int32(diag.UID) is the UID of the req, so can't be used as the Uids field this may mean
            // that WithoutUids functions may still be relevant post sigar change
            //Uids: []int32{int32(diag.UID)},
            //Pid:    c.pid,
        }
        cs = append(cs, conn)
    }
    return cs, nil
}

Do we make a breaking change to the Type field of ConnectionStat? If we report netlink it is accurate, but changes the struct. If we don't then the response is lying about how the information was collected.

Do we have a fall-through if netlink request fails to previous implementation?

func ConnectionsPidMaxWithContext(ctx context.Context, kind string, pid int32, max int) ([]ConnectionStat, error) {
    ret, err := sigarLookup(ctx, kind)
    if err != nil {
        return ret, nil
    }
    # current ConnectionsPidMaxWithContext implementation here?

I would generally be against this because it is basically silently failing and attempting sigar every time even if it will never work on the host.

Based on my parsing of netlink I believe we may still need to parse /proc/ files to get pid and uid information. Would we potentially deprecate the WithoutUids funcs and do a functional replacement of WithNetlink, which only gathers the information that netlink can query?

shirou commented 4 years ago

Thank you for the info. I still can not have a time but I agree to gosigar can not satisfy gopsutil current feature. So I am planing to re-implement by myself (or any contribution is always welcome!)

wcc526 commented 3 years ago

there is open source project https://github.com/dean2021/goss

florianl commented 3 months ago

In https://github.com/shirou/gopsutil/pull/1660 a proper netlink based implementation for net.Connections() is proposed. I'm happy to get your feedback on this proposed change.

cforce commented 2 months ago

In #1660 a proper netlink based implementation for net.Connections() is proposed. I'm happy to get your feedback on this proposed change.

Is the performance/cpu load better or worse finally? There seems to be different opinions https://github.com/shirou/gopsutil/pull/1660#issuecomment-2169227677

Will that as well improve the performance of accessing process statistics via the /proc filesystem ? see related effort https://github.com/open-telemetry/opentelemetry-collector/issues/8789

florianl commented 1 month ago

@cforce if using https://github.com/shirou/gopsutil/pull/1660 in a real world application, CPU consumption is reduced by 10% and heap size is also reduced by 8 to 10%. Data accuracy is another reason for https://github.com/shirou/gopsutil/pull/1660. With netlink one gets all network connections at once. While with the current approach short the returned resuts can contain network connections that no longer exists or even miss network connections.

Once https://github.com/mdlayher/netlink/pull/215 got merged, the memory consumption is expected to be even less.