twilio / video-quickstart-ios

Twilio Video Quickstart for iOS
https://www.twilio.com/docs/api/video

Cast room through AirPlay #473

Closed: sylven closed this issue 4 years ago

sylven commented 4 years ago

Hello! I'm trying to make a video conferencing app that lets the user cast the app view, or ideally a custom mix of video and audio tracks, to an Apple TV through AirPlay.

First of all, I'm sorry for the lack of knowledge and for asking questions that might not look directly related to the Video SDK. I've been searching through all the issues and many other pages for something to guide me, but I couldn't find anything. The iOS SDK documentation is also quite sparse, in my opinion.

Discarding the option of mirroring the mobile screen by selecting it in the iOS UI (unless I could enable it programmatically, which doesn't seem possible), I think the easiest option would be capturing the app view with ReplayKit and then sending it through AirPlay. The docs say 'ReplayKit is incompatible with AVPlayer', which is very unspecific; I'll assume it means that content played with AVPlayer is not visible in the ReplayKit capture.

I'm trying to explore this path by creating a 'CMSampleBuffer' with 'RPScreenRecorder.shared().startCapture()'. I should be able to use this buffer to feed an 'AVSampleBufferDisplayLayer', which would let me play through AirPlay. This is proving difficult because I can't find any samples or good documentation on how to do it. It also looks like it could introduce enough delay to make it unviable, but maybe I'm mistaken. In the 'ReplayKitExample' I notice around 1 s of delay, but I suspect that might be caused by the downscaling applied.
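This is roughly what I have in mind: a minimal, untested sketch that enqueues ReplayKit video buffers onto an 'AVSampleBufferDisplayLayer' (I'm assuming the layer participates in the AirPlay route; audio would need a separate path):

```swift
import ReplayKit
import AVFoundation

final class ScreenCaster {
    // Renders raw CMSampleBuffers; add this layer to a view's layer tree.
    let displayLayer = AVSampleBufferDisplayLayer()

    func start() {
        RPScreenRecorder.shared().startCapture(handler: { [weak self] sampleBuffer, bufferType, error in
            guard error == nil, let layer = self?.displayLayer else { return }
            // Only video buffers go to the display layer; audio buffers
            // would need their own renderer (or a ring buffer, see below).
            if bufferType == .video, layer.isReadyForMoreMediaData {
                layer.enqueue(sampleBuffer)
            }
        }, completionHandler: { error in
            if let error = error {
                print("startCapture failed: \(error)")
            }
        })
    }
}
```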

I was also considering a similar procedure, but converting the video and audio tracks from the remote participant into a playable asset that I could play with AVPlayer or AVSampleBufferDisplayLayer and then send through AirPlay. This might perform better, but I guess it would be really hard to mix the tracks of different participants into a single asset (I don't know if that's even possible in real time). To take this path, I think I would need to implement 'TVIVideoRenderer' and generate the frames from that, as in the sketch below.
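Here's a rough, untested sketch of what I'd try, assuming the 2.x 'TVI*' names and that 'TVIVideoFrame' exposes its pixel buffer as 'imageBuffer'; I stamp frames with host time instead of mapping the frame's own timestamps:

```swift
import TwilioVideo
import AVFoundation

// Forwards decoded remote frames into an AVSampleBufferDisplayLayer.
class SampleBufferRenderer: NSObject, TVIVideoRenderer {
    let displayLayer = AVSampleBufferDisplayLayer()

    func renderFrame(_ frame: TVIVideoFrame) {
        // Wrap the frame's pixel buffer in a CMSampleBuffer.
        var format: CMVideoFormatDescription?
        CMVideoFormatDescriptionCreateForImageBuffer(allocator: kCFAllocatorDefault,
                                                     imageBuffer: frame.imageBuffer,
                                                     formatDescriptionOut: &format)
        guard let formatDescription = format else { return }

        // Host time gives immediate display; mapping frame.timestamp
        // would be more correct for A/V sync.
        var timing = CMSampleTimingInfo(duration: .invalid,
                                        presentationTimeStamp: CMClockGetTime(CMClockGetHostTimeClock()),
                                        decodeTimeStamp: .invalid)
        var sampleBuffer: CMSampleBuffer?
        CMSampleBufferCreateReadyWithImageBuffer(allocator: kCFAllocatorDefault,
                                                 imageBuffer: frame.imageBuffer,
                                                 formatDescription: formatDescription,
                                                 sampleTiming: &timing,
                                                 sampleBufferOut: &sampleBuffer)
        if let buffer = sampleBuffer, displayLayer.isReadyForMoreMediaData {
            displayLayer.enqueue(buffer)
        }
    }

    func updateVideoSize(_ videoSize: CMVideoDimensions, orientation: TVIVideoOrientation) {
        // Resize or transform displayLayer here if needed.
    }
}
```

The renderer would then be attached to a remote track with 'remoteVideoTrack.addRenderer(renderer)' and its layer sent along the same AirPlay path as above.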

For both approaches, it looks like I might need to use 'TPCircularBuffer' for the audio.
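From what I've read, the usual pattern is a lock-free producer/consumer ring, roughly like this (untested; it assumes TPCircularBuffer is integrated as a module or via a bridging header, and the exact integer widths differ between versions of the library):

```swift
import TPCircularBuffer

// The TPCircularBufferInit convenience macro doesn't import into Swift,
// so the underscored function is called directly.
var ringBuffer = TPCircularBuffer()
_TPCircularBufferInit(&ringBuffer, 256 * 1024, MemoryLayout<TPCircularBuffer>.size)

// Capture side: copy interleaved PCM bytes into the ring.
func didCaptureAudio(bytes: UnsafeRawPointer, length: Int) {
    TPCircularBufferProduceBytes(&ringBuffer, bytes, UInt32(length))
}

// Render side: pull whatever is available into the output buffer.
func renderAudio(into output: UnsafeMutableRawPointer, maxLength: Int) -> Int {
    var available: UInt32 = 0
    guard let tail = TPCircularBufferTail(&ringBuffer, &available) else { return 0 }
    let bytesToCopy = min(Int(available), maxLength)
    memcpy(output, tail, bytesToCopy)
    TPCircularBufferConsume(&ringBuffer, UInt32(bytesToCopy))
    return bytesToCopy
}
```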

Do you think this is possible to achieve? Do any of these approaches make sense? Thanks in advance for any info that might shed some light!

EDIT: I was also exploring the option of developing an Apple TV app that could join the room as a spectator, but the Twilio Video SDK is not compatible with tvOS, and I saw that some developers tried compiling WebRTC for tvOS without success. Also, tvOS doesn't support WebViews (nor does it have a web browser). There are client-server apps that can be built, but they seem very limited. So I guess this path should be discarded.

EDIT2: I recently discovered that to cast local files to a Chromecast you need to run a server that serves the media to it. Taking this path, the same server could also feed an AVPlayer. I've tested with an HLS server, and the delay on a LAN is around 8-9 seconds, which is unviable for this use case. I also found some comparisons between protocols, and the best one seems to be RTSP, but it's not compatible with either device. I guess I'll go back to trying an HTTP server that serves video built from image frames.

ceaglest commented 4 years ago

Hey @sylven,

Sorry that I did not respond sooner about this ambitious project. I can only offer some advice, since there is a lot of ground to cover.

Discarding the option of mirroring the mobile screen by selecting it in the iOS UI (unless I could enable it programmatically, which doesn't seem possible),

AirPlay could work as long as you are playing audio + video only and you don't need to speak to other Participants.

Have you tried the AirPlay mirroring route? The default audio device might be a problem, but the AudioDeviceExample includes a playback-only device that might work better with AirPlay.
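For reference, swapping in a custom device is a one-liner before any tracks exist. A sketch assuming the 3.x Swift API, with 'accessToken' and 'roomDelegate' as placeholders:

```swift
import TwilioVideo

// The audio device must be set before any audio tracks are created.
TwilioVideoSDK.audioDevice = ExampleCoreAudioDevice()

let options = ConnectOptions(token: accessToken) { builder in
    builder.roomName = "my-room"
}
let room = TwilioVideoSDK.connect(options: options, delegate: roomDelegate)
```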

I'm trying to explore this path by creating a 'CMSampleBuffer' with 'RPScreenRecorder.shared().startCapture()'. I should be able to use this buffer to feed an 'AVSampleBufferDisplayLayer', which would let me play through AirPlay.

Once you have frames from ReplayKit, you could share them as an AVAsset. There is some discussion about creating an AVAsset on the fly in https://github.com/twilio/video-quickstart-ios/issues/237. You will need LL-HLS, or the MPEG-DASH equivalent, to get low enough latency.
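If you go the asset route on a recent OS, AVAssetWriter can emit fragmented MP4 segments directly (iOS 14+), which is the building block for LL-HLS. A sketch, assuming H.264 at 1280x720 and a 1-second segment interval:

```swift
import AVFoundation
import UniformTypeIdentifiers

// Emits fragmented MP4 segments suitable for an LL-HLS playlist (iOS 14+).
final class SegmentWriter: NSObject, AVAssetWriterDelegate {
    let writer: AVAssetWriter
    let videoInput: AVAssetWriterInput

    override init() {
        writer = AVAssetWriter(contentType: UTType(AVFileType.mp4.rawValue)!)
        writer.outputFileTypeProfile = .mpeg4AppleHLS
        writer.preferredOutputSegmentInterval = CMTime(seconds: 1, preferredTimescale: 1)
        writer.initialSegmentStartTime = .zero

        videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: [
            AVVideoCodecKey: AVVideoCodecType.h264,
            AVVideoWidthKey: 1280,
            AVVideoHeightKey: 720,
        ])
        videoInput.expectsMediaDataInRealTime = true

        super.init()
        writer.delegate = self
        writer.add(videoInput)
    }

    // Called once per finished segment.
    func assetWriter(_ writer: AVAssetWriter,
                     didOutputSegmentData segmentData: Data,
                     segmentType: AVAssetSegmentType) {
        // e.g. write segmentData into the HTTP server's document root
    }
}
```

Each callback hands you one finished segment; you would publish it from your HTTP server and keep the playlist's part entries up to date.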

Do you think this is possible to achieve? Does any of those approaches make sense?

You will need AirPlay to cast to an Apple TV; you just need to decide whether the mirroring route will work or whether you should create an AVAsset for streaming. I think that mirroring is going to be the best way to get low delay without a lot of work on your side. Once you have a streamable asset, you might also be able to get it to work with a Chromecast.

EDIT: I was also exploring the option of developing an Apple TV app that could join the room as a spectator, but the Twilio Video SDK is not compatible with tvOS, and I saw that some developers tried compiling WebRTC for tvOS without success. Also, tvOS doesn't support WebViews (nor does it have a web browser). There are client-server apps that can be built, but they seem very limited. So I guess this path should be discarded.

We were able to compile WebRTC for tvOS and use it in a playback / subscription-only way, but we never officially added support for tvOS.

Best, Chris

ceaglest commented 4 years ago

Hey,

I hope that my response was helpful even though it was pretty late. I'm closing this one since I didn't hear back, but feel free to chime in on #237 to continue the discussion about creating streaming content.

Best, Chris

sylven commented 4 years ago

Hey Christopher,

Thanks a lot for your answer. Unfortunately, it came too late for me to pick up on some of your recommendations. Hopefully it will be useful to other developers.

Regards!