Closed scottlamb closed 2 years ago
Oh, there's an actual bug in the new model. On shutdown, we can get panics like the following:
because this code is assuming the tokio runtime that was used to create the session is still around, when it's actually been shut down without waiting for its tasks to complete.
Hmm, the panic on shutdown should be fixed with 7b0a489, but the performance moved further in the wrong direction. With a 1-thread "multi-threaded" tokio reactor on my main setup:
commit | cpu | flamegraph | notes |
---|---|---|---|
ffmpeg (unsure of exact commit) | ~16% | old test results, flamegraph. see here. | |
retina (unsure of exact commit) | ~14% | same commit as above, different commandline parameters. | |
v0.7.2-1-g307a388 (I think) | ~16% | (don't have handy) | old test results, flamegraph. regression with move to separate per-stream single-thread reactor |
v0.7.3-1-g5e7d558 | ~15% | as of today. likely improved due to new Rust version, new tokio versions including this one that promises significant performance improvements, etc. | |
v0.7.3-2-g7b0a489 | ~19% | as of today. another regression | |
v0.7.3-3-g967834c | ~16% | clawed back some performance via a tokio::task::spawn per frame | |
Looks like having the `block_on` calls on one thread outside the reactor is worse than using the channel for one hand-off per frame. It might be incurring a bunch of handoffs for futures it launches or something.
So I'm going to keep this issue open for now. I could go back to the more performant model in which the stream threads pull from a channel, but that has the downside I mentioned in the https://github.com/scottlamb/moonfire-nvr/commit/7b0a489541c183deadcb03bbe221602d1f5f47e5 commit description: the session could still be open when we start another. And I don't think it's the absolute best model either. I think the best model is having the streamers started from `moonfire_nvr::cmds::run::run` be async and push data once per key frame to a single thread for each sample file directory. That requires more work to achieve, but as I've mentioned elsewhere, batching frames will be helpful anyway for audio support.
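A rough sketch of the "push once per key frame to a single thread per sample file directory" idea, using only `std` threads and channels as a stand-in for the async version. `Frame`, `Batcher`, and `spawn_dir_writer` are hypothetical names for illustration and are not taken from the moonfire-nvr code; the "writer" here only counts frames instead of touching disk.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one received frame.
pub struct Frame {
    pub is_key: bool,
    pub data: Vec<u8>,
}

/// Accumulates frames for one stream and hands off a whole batch
/// each time a new key frame starts.
pub struct Batcher {
    pending: Vec<Frame>,
    tx: mpsc::Sender<Vec<Frame>>,
}

impl Batcher {
    pub fn new(tx: mpsc::Sender<Vec<Frame>>) -> Self {
        Batcher { pending: Vec::new(), tx }
    }

    /// Adds a frame; a key frame flushes everything gathered before it.
    pub fn push(&mut self, frame: Frame) {
        if frame.is_key && !self.pending.is_empty() {
            let batch = std::mem::take(&mut self.pending);
            // One hand-off per key frame rather than one per frame.
            let _ = self.tx.send(batch);
        }
        self.pending.push(frame);
    }

    /// Flushes the final partial batch and drops the sender.
    pub fn finish(mut self) {
        if !self.pending.is_empty() {
            let _ = self.tx.send(std::mem::take(&mut self.pending));
        }
    }
}

/// Spawns the single writer thread for a directory (assumption: one
/// per sample file directory). It returns the size of each batch it
/// received; a real writer would write the frames to disk instead.
pub fn spawn_dir_writer() -> (mpsc::Sender<Vec<Frame>>, thread::JoinHandle<Vec<usize>>) {
    let (tx, rx) = mpsc::channel::<Vec<Frame>>();
    let handle = thread::spawn(move || {
        // The iterator ends once every sender has been dropped.
        rx.into_iter().map(|batch| batch.len()).collect::<Vec<usize>>()
    });
    (tx, handle)
}
```

Each stream would own a `Batcher` holding a clone of the directory's sender, so the writer thread sees one message per group of frames rather than one per frame.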
update: 967834c now does a `tokio::task::spawn` per thread, and we're basically back where we were with ffmpeg. Good enough for now.
In 307a388 I tweaked the threading model for the streamers: before when using Retina, they ran the RTSP code on the main tokio runtime and sent frames over a channel to the per-stream writer threads to write to disk. After, they do everything on the per-stream writer threads.
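For reference, the "before" model (receive on one side, hand each frame over a channel to a per-stream writer thread) looks roughly like the sketch below. This uses `std::sync::mpsc` rather than tokio so it stands alone; `Frame` and `spawn_writer` are hypothetical names, and the writer only tallies bytes instead of writing to disk.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one received frame.
pub struct Frame {
    pub data: Vec<u8>,
}

/// Spawns a per-stream writer thread fed by a bounded channel.
/// `capacity` is how many frames may queue up before the sender
/// blocks. The thread returns the total number of bytes it handled.
pub fn spawn_writer(capacity: usize) -> (mpsc::SyncSender<Frame>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::sync_channel::<Frame>(capacity);
    let handle = thread::spawn(move || {
        let mut bytes_written = 0;
        // The loop ends once every sender has been dropped.
        for frame in rx {
            // A real writer would append the frame to a file here.
            bytes_written += frame.data.len();
        }
        bytes_written
    });
    (tx, handle)
}
```

The receiving side calls `tx.send(frame)` once per frame, which is the per-frame thread hand-off being discussed; dropping `tx` lets the writer thread exit.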
I expected this to modestly improve CPU usage (at least as much as previous experiments to reduce the multi-thread runtime to 1 thread or use the "current thread runtime") because it has fewer thread hand-offs for each frame and because it doesn't do the multi-thread runtime's spinning that I previously mentioned here.
Instead, it's a bit slower. On one setup, instead of CPU falling from ~13–14% (currently using the multi-thread runtime with 1 thread) to <=12%, it rose to ~16%. Not huge, but still worse when I was expecting better. I guess having the more common work of each `recvfrom` syscall being spread out over more threads is worse than having extra handoffs per frame.
Ideally I'd be doing writes once per key frame (mentioned previously at #34 and #116). Then it probably really makes more sense to use the channel, and additionally the extra buffering there can keep the RTSP stream from stalling even if disk writes have a moment of poor latency.
I probably won't look at this until after getting rid of `moonfire-nvr config`, which complicates things due to not being inherently async.