Add zero-copy mode for ring buffer source

Currently, data is always copied out of the ring buffer, regardless of whether a buffer has been provided to NextPacket() / NextIPPacket(). Since the data in the ring buffer is invalidated only upon the next call to those methods, we can re-add a real zero-copy mode by doing:

// Populate the packet
data = s.curTPacketHeader.payloadNoCopyAtOffset(uint32(s.ipLayerOffset), uint32(snapLen))
//s.curTPacketHeader.payloadCopyPutAtOffset(data, uint32(s.ipLayerOffset))

where

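// payloadNoCopyAtOffset returns a sub-slice of the ring buffer (no copy), covering
// the bytes from offset up to to, both relative to the start of the MAC layer.
func (t tPacketHeader) payloadNoCopyAtOffset(offset, to uint32) []byte {
    // Offset of the MAC layer, read from the uint16 at byte 24 of the packet header
    mac := uint32(*(*uint16)(unsafe.Pointer(&t.data[t.ppos+24])))
    return t.data[t.ppos+mac+offset : t.ppos+mac+to]
}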

Similar logic applies to the non-ring-buffer source, which in normal mode also makes a copy in both of the aforementioned methods.
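To illustrate the trade-off, here is a minimal sketch of the difference between returning a view into the ring buffer and copying out of it. The `ring` type and its methods are hypothetical stand-ins (a plain byte slice instead of a memory-mapped buffer), not the actual slimcap implementation:

```go
package main

import "fmt"

// ring is a hypothetical stand-in for a memory-mapped ring buffer: a single
// backing array that is overwritten whenever the next packet arrives.
type ring struct {
	data []byte
}

// next simulates the arrival of a new packet by overwriting the backing array.
func (r *ring) next(pkt []byte) {
	copy(r.data, pkt)
}

// payloadNoCopy returns a sub-slice that aliases the ring buffer (zero-copy).
func (r *ring) payloadNoCopy(from, to int) []byte {
	return r.data[from:to]
}

// payloadCopy copies the requested range into buf and returns the populated
// part of buf, which stays valid independently of the ring buffer.
func (r *ring) payloadCopy(buf []byte, from, to int) []byte {
	n := copy(buf, r.data[from:to])
	return buf[:n]
}

func main() {
	r := &ring{data: make([]byte, 4)}
	r.next([]byte{1, 2, 3, 4})

	view := r.payloadNoCopy(0, 4)
	owned := r.payloadCopy(make([]byte, 4), 0, 4)

	r.next([]byte{9, 9, 9, 9}) // the next packet overwrites the ring

	fmt.Println(view)  // aliases the ring, so it now shows the new packet
	fmt.Println(owned) // independent copy, still shows the old packet
}
```

The zero-copy view is only safe to use until the next packet is requested, which is exactly the invalidation rule described above.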

DoD

[x] ~~Add ZeroCopy() option to both afpacket sources~~
[x] Implement zero-copy operations for ~~NextPacket()/~~ NextIPPacket() (both afpacket sources)
[x] Consider option to provide performance boost for full payload extraction (not possible for NextPacket())
[x] Extend tests to cover the new functionality

Switching to zero-copy mode for NextIPPacket() (and using a buffer to populate) brings the performance of that method almost on par with the functional call:

                                     │       sec/op        │   sec/op     vs base                │
CaptureMethods/NextPacket-4                    151.5n ± 1%   149.1n ± 1%   -1.58% (p=0.000 n=10)
CaptureMethods/NextPacketInPlace-4             56.77n ± 0%   56.32n ± 0%   -0.80% (p=0.000 n=10)
CaptureMethods/NextIPPacket-4                  139.1n ± 1%   136.2n ± 2%   -2.08% (p=0.005 n=10)
CaptureMethods/NextIPPacketInPlace-4           52.77n ± 1%   39.54n ± 0%  -25.07% (p=0.000 n=10)
CaptureMethods/NextPacketFn-4                  38.57n ± 0%   38.23n ± 0%   -0.89% (p=0.000 n=10)

The method NextPacketInPlace currently cannot be changed to use zero-copy mode because it prefixes the raw payload with the capture.Packet header (containing the packet direction, the packet length and the offset to the IP layer). This could only be remedied by changing the interface to something resembling the call to NextIPPacket() and skipping the whole capture.Packet wrapping or, with less impact, by adding a new method, e.g. NextPayload().
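The following sketch shows why a header-prefixed encoding forces a copy. The 6-byte layout, the `packet` type and `newPacket` are assumptions made for illustration only; the real capture.Packet encoding may differ:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// hdrLen is the size of the hypothetical header: 1 byte packet type,
// 1 byte IP layer offset, 4 bytes total packet length.
const hdrLen = 6

// packet is a hypothetical header-prefixed encoding: header, then payload.
type packet []byte

// newPacket populates buf (allocating if it is too small) with the header
// followed by the payload. Because the header bytes do not exist in the ring
// buffer, the payload has to be copied next to them.
func newPacket(buf packet, payload []byte, pktType, ipLayerOffset byte, totalLen uint32) packet {
	n := hdrLen + len(payload)
	if cap(buf) < n {
		buf = make(packet, n)
	}
	buf = buf[:n]
	buf[0] = pktType
	buf[1] = ipLayerOffset
	binary.LittleEndian.PutUint32(buf[2:6], totalLen)
	copy(buf[hdrLen:], payload) // unavoidable copy
	return buf
}

// Payload returns the raw payload following the header.
func (p packet) Payload() []byte { return p[hdrLen:] }

// TotalLen returns the total packet length stored in the header.
func (p packet) TotalLen() uint32 { return binary.LittleEndian.Uint32(p[2:6]) }

func main() {
	ringSlot := []byte{0xde, 0xad, 0xbe, 0xef} // pretend this lives in the ring buffer
	p := newPacket(nil, ringSlot, 0, 14, 60)
	fmt.Println(len(p), p.TotalLen(), p.Payload())
}
```

A method that returns only the payload (plus metadata as separate return values) avoids this constraint, since the returned slice can point straight into the source.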

After some deliberation: there are quite a lot of different combinations (and, at the same time, limitations), so the best I can come up with that is minimal, still covers all use cases and is explicit enough is the following set of interfaces (omitting non-packet interface methods):

// Source denotes a generic packet capture source
type Source interface {

    // NextPacket receives the next packet from the wire and returns it. The operation is blocking. In
    // case a non-nil "buffer" Packet is provided it will be populated with the data (and returned). The
    // buffer packet can be reused. Otherwise a new Packet is allocated.
    NextPacket(pBuf Packet) (Packet, error)

    // NextPayload receives the next packet's payload from the wire and returns it. The operation is blocking.
    // In case a non-nil "buffer" byte slice / payload is provided it will be populated with the data (and returned).
    // The buffer can be reused. Otherwise a new byte slice / payload is allocated.
    NextPayload(pBuf []byte) ([]byte, byte, uint32, error)

    // NextIPPacket receives the next packet's IP layer from the wire and returns it. The operation is blocking.
    // In case a non-nil "buffer" IPLayer is provided it will be populated with the data (and returned).
    // The buffer can be reused. Otherwise a new IPLayer is allocated.
    NextIPPacket(pBuf IPLayer) (IPLayer, PacketType, uint32, error)

    // NextPacketFn executes the provided function on the next packet received on the wire and only
    // returns the ring buffer block to the kernel upon completion of the function. If possible, the
    // operation should provide a zero-copy way of interaction with the payload / metadata.
    NextPacketFn(func(payload []byte, totalLen uint32, pktType PacketType, ipLayerOffset byte) error) error
}

// SourceZeroCopy denotes a generic packet capture source that supports zero-copy operations
type SourceZeroCopy interface {

    // NextPayloadZeroCopy receives the next packet's payload from the wire and returns it. The operation is blocking.
    // The returned payload provides direct zero-copy access to the underlying data source (e.g. a ring buffer).
    NextPayloadZeroCopy() ([]byte, error)

    // NextIPPacketZeroCopy receives the next packet's IP layer from the wire and returns it. The operation is blocking.
    // The returned IPLayer provides direct zero-copy access to the underlying data source (e.g. a ring buffer).
    NextIPPacketZeroCopy() (IPLayer, PacketType, uint32, error)
}
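As a rough sketch of the functional approach, the mock below hands its slot to the callback without copying and releases it only once the callback has returned. `mockSource`, its `released` counter and `errStop` are hypothetical; a real source would return the ring buffer block to the kernel instead of incrementing a counter:

```go
package main

import (
	"errors"
	"fmt"
)

// PacketType is a placeholder for the packet type / direction.
type PacketType = byte

// errStop is a hypothetical sentinel error returned by a callback.
var errStop = errors.New("stop")

// mockSource is a hypothetical in-memory source with a single reusable slot.
type mockSource struct {
	slot     []byte
	released int // number of times the slot has been handed back
}

// NextPacketFn passes the slot to fn without copying and releases it after fn
// has returned, regardless of whether fn succeeded.
func (s *mockSource) NextPacketFn(fn func(payload []byte, totalLen uint32, pktType PacketType, ipLayerOffset byte) error) error {
	defer func() { s.released++ }()
	return fn(s.slot, uint32(len(s.slot)), 0, 14)
}

func main() {
	src := &mockSource{slot: []byte{1, 2, 3}}

	var sum int
	err := src.NextPacketFn(func(payload []byte, totalLen uint32, pktType PacketType, ipLayerOffset byte) error {
		for _, b := range payload {
			sum += int(b) // payload is only valid inside this function
		}
		return nil
	})
	fmt.Println(sum, err, src.released)

	err = src.NextPacketFn(func([]byte, uint32, PacketType, byte) error { return errStop })
	fmt.Println(errors.Is(err, errStop), src.released)
}
```

Anything the caller wants to keep beyond the callback has to be copied or reduced to a value (like the sum here) before returning.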

This way,

All options are available to the caller: a "high level" convenience object (Packet), the raw payload and the IPLayer (removing the necessity to worry about the interface type), and the functional approach via NextPacketFn (which is "best effort" w.r.t. zero-copy operations but guarantees the optimal way for the source).
If a source explicitly supports zero-copy operations (like a memory-mapped ring buffer), they are provided explicitly via the SourceZeroCopy interface and are hence guaranteed (with all the implications, e.g. that the caller can only receive the next packet once done with all operations on the current one).
Zero-copy operations are function-specific (not per source via e.g. a functional option) because not all functions can actually use zero-copy, even if the source supports it (e.g. NextPacket(), because it uses a custom data encoding that cannot be mapped directly to the source).
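From the caller's side, the split into two interfaces could be used roughly as sketched below. The interfaces are reduced to the payload methods, the two mock sources and `readPayload` are hypothetical, and the `byte` / `uint32` return values of NextPayload are assumed to be the packet type and total length:

```go
package main

import "fmt"

// Source is reduced to the single method needed for this sketch.
type Source interface {
	NextPayload(pBuf []byte) ([]byte, byte, uint32, error)
}

// SourceZeroCopy is reduced to the single method needed for this sketch.
type SourceZeroCopy interface {
	NextPayloadZeroCopy() ([]byte, error)
}

// plainSource is a hypothetical source that can only copy.
type plainSource struct{ pkt []byte }

func (s *plainSource) NextPayload(pBuf []byte) ([]byte, byte, uint32, error) {
	if cap(pBuf) < len(s.pkt) {
		pBuf = make([]byte, len(s.pkt))
	}
	pBuf = pBuf[:len(s.pkt)]
	copy(pBuf, s.pkt)
	return pBuf, 0, uint32(len(s.pkt)), nil
}

// ringSource is a hypothetical source that additionally supports zero-copy.
type ringSource struct{ plainSource }

func (s *ringSource) NextPayloadZeroCopy() ([]byte, error) { return s.pkt, nil }

// readPayload uses the zero-copy path if the source offers it and falls back
// to the copying path otherwise.
func readPayload(src Source, buf []byte) (payload []byte, zeroCopy bool, err error) {
	if zc, ok := src.(SourceZeroCopy); ok {
		payload, err = zc.NextPayloadZeroCopy()
		return payload, true, err
	}
	payload, _, _, err = src.NextPayload(buf)
	return payload, false, err
}

func main() {
	buf := make([]byte, 0, 64)
	for _, src := range []Source{
		&ringSource{plainSource{pkt: []byte{1, 2, 3}}},
		&plainSource{pkt: []byte{4, 5}},
	} {
		p, zc, err := readPayload(src, buf)
		fmt.Println(p, zc, err)
	}
}
```

Because the capability is expressed through the type rather than an option, a caller that receives a SourceZeroCopy knows the no-copy guarantee holds and must finish with the payload before requesting the next packet.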

Making the zero-copy operations explicit has the advantage of not having to support both paths in one function, which actually improves the performance of those operations to the point of outpacing the functional approach (absolute numbers are not comparable to the ones further up because they were measured on a different machine):

                                      │     sec/op     │
CaptureMethods/NextPacket-4               70.07n ± 21%
CaptureMethods/NextPacketInPlace-4        30.64n ±  0%
CaptureMethods/NextPayload-4              60.95n ±  0%
CaptureMethods/NextPayloadInPlace-4       20.04n ±  1%
CaptureMethods/NextPayloadZeroCopy-4      18.13n ±  0%
CaptureMethods/NextIPPacket-4             62.18n ±  4%
CaptureMethods/NextIPPacketInPlace-4      27.40n ±  1%
CaptureMethods/NextIPPacketZeroCopy-4     17.34n ±  0%
CaptureMethods/NextPacketFn-4             18.52n ±  1%
