Closed · fako1024 closed this issue 1 year ago
Note: This is on my local working branch of #88 and uses the changes in this PR: https://github.com/fako1024/slimcap/pull/34
Raised https://github.com/fako1024/slimcap/issues/35 to cover packet counter issue.
@els0r I was playing around a little: The first two are quite easily solvable without introducing any performance penalty (albeit with a bit of a cringe, because they require additional synchronization outside of the existing state machine / run group stuff). The slimcap counter issue I can easily solve as well. The writeout channel stuff is a bit more contrived from what I see. I'll keep you posted...
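For illustration, a minimal sketch of the kind of "additional synchronization outside of the existing state machine" this could mean: a dedicated mutex around the packet counter so that status reads neither race with the capture loop nor reset anything. All field and method names below are hypothetical, not the actual goProbe code.

```go
// Hypothetical Capture with a dedicated counter mutex.
// Requires: import "sync"
type Capture struct {
	statsMutex       sync.Mutex
	packetsProcessed uint64
}

// addProcessed is called from the capture loop.
func (c *Capture) addProcessed(n uint64) {
	c.statsMutex.Lock()
	c.packetsProcessed += n
	c.statsMutex.Unlock()
}

// peekProcessed returns the counter without resetting it, so repeated
// status calls (or an interleaved rotation) don't lose counts.
func (c *Capture) peekProcessed() uint64 {
	c.statsMutex.Lock()
	defer c.statsMutex.Unlock()
	return c.packetsProcessed
}
```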
More issues (or rather details) found when assessing:

- Packet counters can only be retrieved via StatusAll() on the CaptureManager (which unfortunately also resets the counters, and if a rotation happens some time in between they get reset as well, making it impossible to actually track what has been processed)...
- State changes are not atomic / guarded by the existing mutex (at least not everywhere):
```go
func (c *Capture) setState(s State) {
	c.state = s
	c.ctx = logging.WithFields(c.ctx, "state", s.String())

	// log state transition
	logger := logging.FromContext(c.ctx)
	logger.Debugf("interface state transition")
}
```
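A minimal sketch of what guarding this with the existing mutex could look like (the c.mutex field name is an assumption; everything else is taken from the snippet above):

```go
func (c *Capture) setState(s State) {
	c.mutex.Lock()
	c.state = s
	c.ctx = logging.WithFields(c.ctx, "state", s.String())
	ctx := c.ctx // copy under the lock so the read below doesn't race
	c.mutex.Unlock()

	// log state transition outside the critical section
	logger := logging.FromContext(ctx)
	logger.Debugf("interface state transition")
}
```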
I encountered this when trying to extract the state of each capture (not via Status(), since that also resets the counters) using a RunGroup, similar to all the other *All() methods of the CaptureManager. Also, I'm afraid the concept of capturesCopy does not prevent race conditions.
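To make the capturesCopy point concrete, here is a sketch of why a shallow copy of the interface map doesn't help: the copy still holds the same *Capture pointers, so readers iterating the copy race with any goroutine mutating a Capture (e.g. via setState()). Field names are assumptions for illustration.

```go
// Hypothetical manager fields: cm.mutex sync.Mutex, cm.captures map[string]*Capture.
capturesCopy := make(map[string]*Capture, len(cm.captures))
cm.mutex.Lock()
for iface, c := range cm.captures {
	capturesCopy[iface] = c // copies the pointer, not the Capture it points to
}
cm.mutex.Unlock()

// Later, without any lock held:
for _, c := range capturesCopy {
	_ = c.state // unsynchronized read: still races with setState() above
}
```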
First off: thanks for looking into this and embracing the ugly. As discussed, let's try to rework this part and keep only what we need.
The thing with the logger is concerning. I wonder if it has to do with using c.ctx vs. a regular context. Definitely something we should look out for when restructuring the code base.

I want to get rid of storing the context inside the structs capture and captureManager and make it part of the function calls whenever possible.
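Sketched with hypothetical signatures, that direction would look roughly like this: the caller owns the context and passes it down, so nothing ever swaps a context field under a running goroutine.

```go
// Before (context cached on the struct; swapping it in setState()
// races with reads like this one):
//
//	func (c *Capture) rotate() {
//		logger := logging.FromContext(c.ctx)
//		...
//	}

// After (hypothetical signature): the context is threaded through.
func (c *Capture) rotate(ctx context.Context) {
	logger := logging.FromContext(ctx)
	logger.Debugf("rotating capture")
	// ...
}
```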
Context makes sense given the stack trace from the race, so maybe that's the culprit indeed. Nice digging!! :muscle:
After straightening out a couple of issues found in #88 via https://github.com/fako1024/slimcap/issues/33, running the E2E test on a basic flow (single interface, data being piped through a mock ring buffer source from a pcap file) revealed several data races, all of which seem to be related to the CaptureManager (respectively the state machine of the individual captures). The following ones I have found:

Free() is called while a packet read is still in progress (see trace below). Way back I think we had a quick chat about whether it's ensured that the capture is actually stopped when closing() is run. According to the test this doesn't seem to be the case:

```
Previous read at 0x00c0000d4940 by goroutine 23:
  github.com/fako1024/slimcap/capture/afpacket/afring.(*Source).nextPacket()
      /home/fako/Develop/go/src/github.com/fako1024/slimcap/capture/afpacket/afring/afring.go:331 +0x5e
  github.com/fako1024/slimcap/capture/afpacket/afring.(*Source).NextPacket()
      /home/fako/Develop/go/src/github.com/fako1024/slimcap/capture/afpacket/afring/afring.go:136 +0x57
  github.com/fako1024/slimcap/capture/afpacket/afring.(*MockSource).NextPacket()
```
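One way to rule this particular race out, sketched under assumed names (the Source interface below is a simplified stand-in, not slimcap's actual one), is to have closing() block until the capture goroutine has returned before Free() is invoked:

```go
// Simplified stand-in for the real capture source (illustration only).
type Source interface {
	NextPacket() ([]byte, error)
	Unblock() error
	Free() error
}

// Hypothetical wiring: the capture goroutine closes captureDone when it
// returns, and closing() waits for that before releasing the buffer.
type Capture struct {
	src         Source
	captureDone chan struct{}
}

func (c *Capture) process() {
	defer close(c.captureDone)
	for {
		if _, err := c.src.NextPacket(); err != nil {
			return // source was unblocked or closed
		}
		// ... process the packet ...
	}
}

func (c *Capture) closing() error {
	// Cancel a potentially blocking NextPacket() call first ...
	if err := c.src.Unblock(); err != nil {
		return err
	}
	// ... wait until the capture routine has actually returned ...
	<-c.captureDone
	// ... and only then release the ring buffer.
	return c.src.Free()
}
```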