mdlayher / raw

Package raw enables reading and writing data at the device driver level for a network interface. MIT Licensed.

raw: add BSD runtime network poller integration #41

Open mdlayher opened 5 years ago

mdlayher commented 5 years ago

An analogous change to https://github.com/mdlayher/raw/pull/40 should be made for BSD, eliminating the need for a read loop with sleeps in the middle.

tatsushid commented 5 years ago

I was interested in implementing this and researched whether it is possible. As a result, I found that it is currently impossible (at least as of Go 1.13).

BSD systems use kqueue for I/O multiplexing, and Go also uses it internally. To implement this as in the Linux case, the BPF device file has to be registered with that internal kqueue. The kqueue registration code is the netpollopen function in runtime/netpoll_kqueue.go, which is ultimately called from os.NewFile, the same way epoll registration happens on Linux. Here is the function:

func netpollopen(fd uintptr, pd *pollDesc) int32 {
        // Arm both EVFILT_READ and EVFILT_WRITE in edge-triggered mode (EV_CLEAR)
        // for the whole fd lifetime. The notifications are automatically unregistered
        // when fd is closed.
        var ev [2]keventt
        *(*uintptr)(unsafe.Pointer(&ev[0].ident)) = fd
        ev[0].filter = _EVFILT_READ
        ev[0].flags = _EV_ADD | _EV_CLEAR
        ev[0].fflags = 0
        ev[0].data = 0
        ev[0].udata = (*byte)(unsafe.Pointer(pd))
        ev[1] = ev[0]
        ev[1].filter = _EVFILT_WRITE
        n := kevent(kq, &ev[0], 2, nil, 0, nil)
        if n < 0 {
                return -n
        }
        return 0
}

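This is fine for other file types but does not work with a BPF device file. As far as I could find, a BPF device file either supports only the EVFILT_READ kqueue filter and not EVFILT_WRITE, or does not support kqueue at all. Because netpollopen submits both filters in a single kevent call, a failure to register EVFILT_WRITE fails the whole registration. Here is the list of man pages for each OS.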

I also tested this on FreeBSD 12.0 and macOS 10.14 with a modified raw package and netpollopen function. On FreeBSD 12.0, the BPF device file was registered with kqueue as long as netpollopen did not try to register EVFILT_WRITE. On macOS, registration failed even with the modified netpollopen (the XNU kernel source appears to support BPF read registration, but it does not work, and I am not sure why).

The behavior of netpollopen is being discussed upstream in Go in another context, https://github.com/golang/go/issues/19093, so this may become possible on a limited set of BSD platforms in the future.

mdlayher commented 5 years ago

Thank you for sharing. That is unfortunate; is there any alternative to BPF device files that we could possibly leverage? Hooking into the runtime network poller's epoll support on Linux has been hugely beneficial and it'd be great to not have to bring in a bunch of added infrastructure just for BSD.