Closed: iceblue1680 closed this issue 7 years ago
Hi @iceblue1680, thanks for the report. Is there any further instrumentation you can run, since you're already set up? For example, CPU or memory usage, or using instruments to see which methods might be taking a long time? This is on our list of bugs, but we have a few releases we're working on and any further information would help us narrow it down and get it fixed sooner.
Hi @joeljfischer, thanks for the quick response. It will take some time. For now, we noticed that the v4.6.0 release notes mention updating the iAP transport to run on dedicated background threads. Could that be the cause?
Hi @iceblue1680, we investigated and found that it does appear to be slightly smoother on v4.5. This is the only change between v4.5 and v4.6 that may have an effect.
We will have to do additional investigation when possible, hopefully before v5.0 is released. If you are able to investigate further, please let us know your results.
@davidswi @shoamano
I've taken a look at this at lunch, and it seems that, as @iceblue1680 pointed out above, moving the iAP transport to the background threads is what changed. After some investigation, I found that threads can be assigned a priority, as shown here.
One test that I performed (in a very non-scientific way) was setting the iAP background thread to the maximum priority in SDLIAPSession.m (around line 70), like so:
```objc
// Start I/O event loop processing events in iAP channel
self.ioStreamThread = [[NSThread alloc] initWithTarget:self selector:@selector(sdl_accessoryEventLoop) object:nil];
[self.ioStreamThread setThreadPriority:1.0];
[self.ioStreamThread setName:IOStreamThreadName];
[self.ioStreamThread start];
```
Whereas before, no priority was set, and I would guess the system itself was assigning some value depending on available resources, etc.
I noticed an improvement with that change (again, non-scientifically measured), so it may help.
The original proxy implementation used a tight polling loop for writes within a dispatch block enqueued on an output dispatch queue. That implementation allowed the main thread to service the UI, but it violated the requirement that streams only be accessed from the same thread and run loop they are scheduled on. It is possible that jamming data into the output stream in a polled fashion might help video performance, along with setting the thread priority and quality of service. I would prefer not to do this, as it would contribute to significant battery drain, particularly during navigation, when the CPU is constantly waking to service GPS updates and the nav app is consuming considerable resources to perform routing.
Nav apps currently require the phone to be plugged into USB on both platforms, so battery drain shouldn't be as much of an issue (it's being charged at that time) as streaming performance to the user.
Thanks Brett, I think we both know of reports from end users complaining that the battery is draining or charging slowly even when connected to USB. Also, the goal is to allow wireless streaming in the near term. We want to be as light touch as possible because optimizing for light touch typically yields good performance improvements in all cases.
@davidswi Understood where you're coming from, but streaming performance is significantly degraded compared to previous levels, to a degree that's pretty unacceptable. We certainly want to stick to using a dedicated, non-UI/main thread, but we're looking for ways to safely improve performance. I think that until we can get metrics on actual battery drain, we should consider any and all options for improvement.
From our analysis of video streaming, the main performance bottleneck is in the C++ container copy and expand operations in the SDLCore as buffers are assembled into frames and messages.
OK @joeljfischer, I hear that. Let's do an Instruments run on both implementations and see where the hotspots are.
And then let's follow the guidelines above to optimize.
Also, there are some guidelines about the best ways to capture OpenGL and UIKit/CALayer based content. First, renderInContext has the best performance characteristics for UIKit/CALayer content as discussed here: https://github.com/radi/LiveFrost/issues/10 Second, for OpenGL capture, this StackOverflow topic is of interest: https://stackoverflow.com/questions/44792778/how-to-record-video-from-arkit
I found a possible cause of this performance degradation and created PR #754.
Fixed in v4.7.3
We found that video streaming performance degraded when integrated with v4.6.1 compared to the previous version, v4.5.5.
Please see the attached video: v4.5.5.mov.zip v4.6.1.mov.zip
iOS Version: 10.3.3