twilio / video-quickstart-ios

Twilio Video Quickstart for iOS
https://www.twilio.com/docs/api/video
MIT License
465 stars 178 forks

Support for PIP mode in iOS #237

Open pooja816 opened 6 years ago

pooja816 commented 6 years ago

Support for PIP mode in iOS

Please add support for PIP mode in iOS

Expected Behavior

While a video call is in progress, I also want to be able to access other features of the app at the same time, e.g. the way WhatsApp handles a video call.

Actual Behavior

When I try to access other features of the app during a call, I have to disconnect the call.

Video iOS SDK

1.3.4 via CocoaPods

Xcode

9.2

iOS Version

10.0

iOS Device

iPhone 7

ceaglest commented 6 years ago

Hey @pooja816,

As far as I'm aware there is no way to use Picture-in-Picture support with arbitrary video content (such as content drawn by TVIVideoView). If you want PiP-like functionality, you would need to add your own support for it at the application level (some sort of draggable UIView which floats on top of your ViewControllers), and you would not be able to display video after backgrounding the app.

For reference, here are Apple's docs on AVPictureInPictureController.

I hope this helps, even if it's not the answer that you were looking for.

Best, Chris Eagleston

ceaglest commented 6 years ago

Thinking about your question a little more, there is no reason why the TwilioVideo objects (like a Room) need to be tied to a specific ViewController.

If your goal is just to do more than one thing at a time inside your application (forgetting about backgrounding concerns), then you could write a model controller class which is responsible for managing the usage of TwilioVideo and coordinating with your UIViewController(s) to display content and respond to user interactions. Our sample code uses a single ViewController for simplicity, but there is nothing preventing your app from organizing and managing the SDK objects differently.
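
As a rough sketch of that separation (names like `CallManager` and `CallManagerDelegate` are placeholders, not part of the quickstart, and the delegate signatures follow the 1.x/2.x Swift quickstart, so adjust them to your SDK version), a model controller could own the Room like this:

```swift
import TwilioVideo

// Hypothetical model controller that owns the Room instead of a view controller.
protocol CallManagerDelegate: AnyObject {
    func callManagerDidConnect(_ manager: CallManager)
    func callManagerDidDisconnect(_ manager: CallManager, error: Error?)
}

class CallManager: NSObject, TVIRoomDelegate {
    static let shared = CallManager()

    weak var delegate: CallManagerDelegate?
    private(set) var room: TVIRoom?
    var localVideoTrack: TVILocalVideoTrack?

    func connect(token: String, roomName: String) {
        let options = TVIConnectOptions(token: token) { builder in
            builder.roomName = roomName
            if let track = self.localVideoTrack {
                builder.videoTracks = [track]
            }
        }
        // The SDK objects live here, independent of any UIViewController.
        room = TwilioVideo.connect(with: options, delegate: self)
    }

    func disconnect() {
        room?.disconnect()
        room = nil
    }

    // MARK: - TVIRoomDelegate
    func didConnect(to room: TVIRoom) {
        delegate?.callManagerDidConnect(self)
    }

    func room(_ room: TVIRoom, didDisconnectWithError error: Error?) {
        self.room = nil
        delegate?.callManagerDidDisconnect(self, error: error)
    }
}
```

Your view controllers would then observe the CallManager (via the delegate or notifications) instead of owning the Room themselves.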

ceaglest commented 6 years ago

Hey @pooja816,

I never heard back from you. Are you looking for more comprehensive sample code, or AVPictureInPictureController support so that you can play video outside of an app?

Best, Chris

pooja816 commented 6 years ago

Hi @ceaglest Thanks. I am trying a model controller class that is responsible for managing the usage of TwilioVideo and coordinating with my UIViewController(s) to display content.

pooja816 commented 6 years ago

Hi @ceaglest I tried managing things through a model controller, and when the view disappears the call functionality keeps working fine. But adding PiP-mode support for navigating the application during a call (some kind of draggable view) is a bit complex.

Is there any way to add the view to the status bar (UI like when the app moves to the background and the call continues)?

Please help me as I am stuck on this.

ceaglest commented 6 years ago

Hi @pooja816,

Is there any way to add the view to the status bar (UI like when the app moves to the background and the call continues)?

Not that I'm aware of; you shouldn't interact with the status bar directly. Apple's PiP support allows playback in the background as long as you use AVPlayer. I don't believe the status bar is part of your key UIWindow, but rather something that SpringBoard or UIKit provides.

I tried managing things through a model controller, and when the view disappears the call functionality keeps working fine. But adding PiP-mode support for navigating the application during a call (some kind of draggable view) is a bit complex.

I don't have an example to share at the moment, but we do plan on offering something more complex which will use multiple view controllers in the future.

I'll be in touch once we have a concrete example.

Best, Chris

pooja816 commented 6 years ago

Hi @ceaglest When 2 users are connected in a room, the call works properly. But I am stuck on the following:

  1. When one user accesses other features of the app during the call, how do I pause the video? If I set localVideoTrack?.isEnabled = false, then a black screen appears for the other user.
  2. After accessing other features of the app, when I come back to the video call screen, the remote video is not rendered properly. I am not getting a callback on addedVideoTrack of the TVIParticipantDelegate method, so the remote video is not rendered.

pooja816 commented 6 years ago

Hi @ceaglest Any update on this?

ceaglest commented 6 years ago

Hi @pooja816,

Sorry for the long delay, somehow I missed your last response.

When one user accesses other features of the app during the call, how do I pause the video? If I set localVideoTrack?.isEnabled = false, then a black screen appears for the other user.

What kind of Room are you using? Disabling a LocalVideoTrack will cause black frames to be sent in a Peer-to-Peer Room, but in a Group Room you should get the final frame and no black frames when pausing.

After accessing other features of the app, when I come back to the video call screen, the remote video is not rendered properly. I am not getting a callback on addedVideoTrack of the TVIParticipantDelegate method, so the remote video is not rendered.

Can you provide more information here? What is the sequence of events from our SDK? Did you get a track removed callback? How do you manage your renderer when pushing / popping more view controllers?

Best, Chris

pooja816 commented 6 years ago

Hi @ceaglest

What kind of Room are you using? Disabling a LocalVideoTrack will cause black frames to be sent in a Peer-to-Peer Room, but in a Group Room you should get the final frame and no black frames when pausing.

Yes, I am using a Peer-to-Peer Room. How do I create a Group Room?

Did you get a track removed callback?

No, I am not getting a callback on track removed. How do I remove a track?

How do you manage your renderer when pushing / popping more view controllers?

When the call starts for the first time, the room is created, the participant is added to the room, and when the call connects, the callback is received on addedVideoTrack and the remote video of the other participant is rendered. On popping the video call view, I am only disabling the local video track: localVideoTrack?.isEnabled = false.

And when I move to the video call screen again, I am initialising the localVideoTrack using the following code:

```swift
func startPreview() {
    if PlatformUtils.isSimulator {
        return
    }

    // Preview our local camera track in the local video preview view.
    camera = TVICameraCapturer(source: .frontCamera, delegate: self)
    localVideoTrack = TVILocalVideoTrack.init(capturer: camera!)
    if (localVideoTrack == nil) {
        logMessage(messageText: "Failed to create video track")
    } else {
        // Add renderer to video track for local preview
        localVideoTrack!.addRenderer(self.previewView)

        logMessage(messageText: "Video track created")
    }
}
```

piyushtank commented 6 years ago

@pooja816 Thanks for responding with the information. Chris is on vacation for the rest of the week; in the meantime, let me try to help solve the problem:

Yes, I am using Peer-to-Peer Room. How to create a Group Room?

Here is the REST API documentation on how to create a Group Room. You can use Type=group while creating the Room.
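
Purely to illustrate the shape of that request (room creation normally belongs on your server; never embed your Account SID / Auth Token in the iOS app, the values below are placeholders):

```swift
import Foundation

// Illustration only: create a Group Room via the Twilio Video Rooms REST API.
let accountSid = "ACXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"   // placeholder
let authToken = "your_auth_token"                        // placeholder

var request = URLRequest(url: URL(string: "https://video.twilio.com/v1/Rooms")!)
request.httpMethod = "POST"
let credentials = Data("\(accountSid):\(authToken)".utf8).base64EncodedString()
request.setValue("Basic \(credentials)", forHTTPHeaderField: "Authorization")
request.setValue("application/x-www-form-urlencoded", forHTTPHeaderField: "Content-Type")
request.httpBody = "UniqueName=my-group-room&Type=group".data(using: .utf8)

URLSession.shared.dataTask(with: request) { data, _, _ in
    // The response JSON describes the created Room (sid, type, status, ...).
    if let data = data, let body = String(data: data, encoding: .utf8) {
        print("Room created: \(body)")
    }
}.resume()
```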

No, I am not getting callback on track removed. How to remove a track?

You can remove a video track by calling [localParticipant unpublishVideoTrack:localVideoTrack]; however, you may not want to remove the video track for your use case. When you move away from the video screen, you can call localVideoTrack?.isEnabled = false. Then, assuming you have not removed the renderer from the video track, when you move back to the call screen you should set it back to true: localVideoTrack?.isEnabled = true.
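
A minimal sketch of that suggestion, assuming the track is owned by a shared object (the `CallManager.shared` name is illustrative) so it survives the view controller being popped:

```swift
// Pause the camera when leaving the call screen, resume when returning.
override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    CallManager.shared.localVideoTrack?.isEnabled = false
}

override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    CallManager.shared.localVideoTrack?.isEnabled = true
}
```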

Let me know if you have any questions.

pooja816 commented 6 years ago

Hi @piyushtank Thanks for your reply. When the call starts for the first time, the video call works properly. But when the user comes back to the video calling screen after accessing other features of the app, the remote video is not shown. Here is the demo of what I have tried in the VideoCallKitQuickStart project.

piyushtank commented 6 years ago

@pooja816 Thanks for sharing the code. I will try it out and let you know my findings.

piyushtank commented 6 years ago

@pooja816 I tried the sample code. I noticed you have added ViewController1 as the root view controller and embedded the view controllers in a navigation controller.

When you press "New Call" and connect to a Room, ViewController2 gets pushed on top of ViewController1. When you are connected to a Room, it displays two buttons, "Back" and "Settings". All OK up to this point.

Now you can observe that the call works as expected when coming back from the Settings screen, but it does not give the expected results when returning from ViewController1. Here is the reason: when you press the "Back" button, the existing ViewController2 gets destroyed. And since ViewController2 owns the Room, the Room and all tracks get destroyed, and it gets disconnected from the Room. So when you come back to the call screen by pressing the "Continue" button on ViewController1, the app creates a new instance of ViewController2 and pushes it on top.

On the other hand, when you press the "Settings" button, ViewController2 does not get destroyed and the SettingsViewController gets pushed on top. So when coming back to ViewController2 from the SettingsViewController, the call remains connected because it is using the existing ViewController2.

This is iOS UIKit behavior; please see [this](https://developer.apple.com/documentation/uikit/uinavigationcontroller) for more information.

Also, you should not post tokens or any secret keys when sharing code in a public place.

Please let me know if you have any questions.

pooja816 commented 6 years ago

Hi @piyushtank

When you press the "Back" button, the existing ViewController2 gets destroyed. And since ViewController2 owns the Room, the Room and all tracks get destroyed, and it gets disconnected from the Room.

Yes, when I press back, ViewController2 gets destroyed. But the Room is owned by CallManager, and when coming back to ViewController2 again, I am getting the same instance of the Room (CallManager.shared.room).

Please guide me on how to access other features of the app during a video call so that everything works fine when coming back to the video call screen.

piyushtank commented 6 years ago

@pooja816 Correct, I also noticed that the Room is stored in CallManager. Unfortunately, the SDK does not allow resetting room.delegate, so you will be required to store the previous ViewController and reuse it.

I have fixed the app so that you can access other features of the app during a video call and come back to the video call screen. See this.
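
Roughly, the idea is to keep the connected view controller alive instead of recreating it, since the Room's delegate can't be reassigned. A sketch (the `callViewController` property on the shared CallManager and the storyboard identifier are assumptions for illustration):

```swift
// Sketch: re-push the same in-call view controller that is already the Room's delegate.
@IBAction func continueCallTapped(_ sender: Any) {
    if let existing = CallManager.shared.callViewController {
        navigationController?.pushViewController(existing, animated: true)
    } else {
        let vc = storyboard!.instantiateViewController(withIdentifier: "ViewController2")
        CallManager.shared.callViewController = vc as? ViewController2
        navigationController?.pushViewController(vc, animated: true)
    }
}
```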

Let me know if you have any questions.

pooja816 commented 6 years ago

@piyushtank Thanks. It is working fine. But is it good practice to keep the viewController instance around?

KrisConrad commented 6 years ago

Sorry to dig up an old issue, but is there any way, or are there future plans, to support AVPictureInPictureController? While a custom in-app PiP mode is doable, I was hoping to leverage AVPictureInPictureController so the user is able to navigate outside of the app and keep their video call in PiP mode.

ceaglest commented 6 years ago

Hi @KrisConrad,

We don't have a solution to this one at the moment.

The problem is that AVPictureInPictureController only works with AVPlayerLayer, and we can't provide our content directly to AVPlayer to participate in this process. One thought is: could we make a TVIVideoRenderer that serves video content to AVPlayer via AVURLAsset? I don't know if it's possible to package up I420 or NV12 frames into a format which is streamable over a network without actually encoding as H.264, etc.

Ideally, Apple would extend AVPictureInPictureController to support other layer classes, like AVSampleBufferDisplayLayer.

Regards, Chris

ceaglest commented 4 years ago

Closing, as supporting PiP isn't on our roadmap for 2020. If Apple offers improved support for AVPictureInPictureController on iPadOS 14 then we may revisit this issue.

tmm1 commented 4 years ago

I reported FB7747223 to Apple (on feedbackassistant.apple.com):

AVPictureInPictureController does not work with AVSampleBufferDisplayLayer: I have an iOS app that uses AVSampleBufferDisplayLayer for playback. I would like to integrate the new PiP APIs on iOS 14, but I cannot use PiP because AVPictureInPictureController only accepts AVPlayerLayer.

ceaglest commented 4 years ago

I reported FB7747223 to Apple (on feedbackassistant.apple.com):

Thanks! If Apple accepts my lab appointment for tomorrow I will mention your feedback.

mcorner commented 4 years ago

@ceaglest Does iOS 14 change anything here? Or does it just bring the iPad functionality to the iPhone? We would be interested in putting video calls into PiP.

ceaglest commented 4 years ago

@mcorner It looks like PiP functionality is coming to the iPhone but with the same constraint of the content being an AVPlayerLayer.

I will be in the AVFoundation lab later today to discuss it with Apple. If you constrain a Room to use H.264, then any of the video content could in theory be played by an AVPlayer if it were remuxed to a transport stream and provided to an AVAsset with a custom protocol. It's just in-memory copies, but the delay might be too much for this technique to work.

Edit: I am speaking about private Video APIs where I have access to the decrypted H.264 NALUs before they are decoded to a CVPixelBuffer. There aren't public APIs for this at the moment, but imagine a Renderer that receives CMSampleBuffers.
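
For context, constraining a Room to H.264 is done through codec preferences at connect time. A sketch using the 2.x-era class names (newer SDK versions rename these types for Swift, so check the ConnectOptions builder for the version you're on):

```swift
// Sketch: prefer H.264 so published video uses a codec an AVPlayer-based
// pipeline could, in principle, consume after remuxing.
let connectOptions = TVIConnectOptions(token: accessToken) { builder in
    builder.roomName = "my-room"
    builder.preferredVideoCodecs = [TVIH264Codec()]
}
let room = TwilioVideo.connect(with: connectOptions, delegate: self)
```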

tmm1 commented 4 years ago

If Apple accepts my lab appointment for tomorrow I will mention your feedback. I will be in the AVFoundation lab later today to discuss it with Apple.

Really appreciate it! I tried to get an appt but missed the window. I will try again for the normal AVFoundation lab later this week.

If you get a chance, could you also mention this issue which affects HDR playback when using AVSampleBufferDisplayLayer:

FB7427457: HDR video not rendered with correct colors when using AVSampleBufferDisplayLayer with uncompressed CVPixelBuffers from VideoToolbox

tmm1 commented 4 years ago

@ceaglest Anything useful in yesterday's lab appt?

ceaglest commented 4 years ago

Hey @tmm1,

There isn't a trivial solution like using AVSampleBufferDisplayLayer, and Apple's engineers couldn't say if they would add it in the future. They asked me to file a feedback in addition to yours.

They did make an interesting suggestion, but there are no guarantees of success. In iOS 14 AVAssetWriter can write fragmented mp4 files natively. The idea is to pass-through (or re-encode) a video track using AVAssetWriter and produce a fragmented mp4. This file is provided as an AVAsset in memory by implementing an AVAssetResourceLoaderDelegate, and authoring a custom playlist for the content. Everything occurs in roughly real-time as the video frames come in, and HLS is delivered to AVPlayer as mp4 fragments are produced. The segmentation is going to determine how much latency there is between the app and playback of the PiP feed.

I am trying to implement this based upon:

And starting with access to the decrypted H.264 NALUs. I'll let you know how it goes. If you want to try it yourself with raw frames you will have to provide compression settings to the AVAssetWriterInput.
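
For anyone following along, the AVAssetWriter half of that idea might be configured roughly like this on iOS 14 (the AVFoundation properties are the real fragmented-MP4 APIs; the SegmentProducer wrapper and the wiring into a resource loader are illustrative and unproven here):

```swift
import AVFoundation
import UniformTypeIdentifiers

// Sketch: produce fragmented MP4 segments in memory with AVAssetWriter (iOS 14+).
final class SegmentProducer: NSObject, AVAssetWriterDelegate {
    private let writer: AVAssetWriter
    private let input: AVAssetWriterInput

    override init() {
        writer = AVAssetWriter(contentType: UTType(AVFileType.mp4.rawValue)!)
        writer.outputFileTypeProfile = .mpeg4AppleHLS
        writer.preferredOutputSegmentInterval = CMTime(seconds: 1, preferredTimescale: 1)
        writer.initialSegmentStartTime = .zero

        // Pass-through of already-encoded H.264 samples; supply outputSettings
        // (compression properties) instead if starting from raw frames.
        input = AVAssetWriterInput(mediaType: .video, outputSettings: nil)
        input.expectsMediaDataInRealTime = true
        writer.add(input)

        super.init()
        writer.delegate = self
    }

    func start() {
        writer.startWriting()
        writer.startSession(atSourceTime: .zero)
    }

    func append(_ sampleBuffer: CMSampleBuffer) {
        if input.isReadyForMoreMediaData {
            input.append(sampleBuffer)
        }
    }

    // Called with each initialization / media segment as Data.
    func assetWriter(_ writer: AVAssetWriter,
                     didOutputSegmentData segmentData: Data,
                     segmentType: AVAssetSegmentType) {
        // Hand the segment to the playlist / resource loader (not shown).
    }
}
```

The segment interval chosen above is the knob that trades latency against overhead.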

Best, Chris

tmm1 commented 4 years ago

@ceaglest I had a crazy idea this morning which may work (or may fail spectacularly): what if we made a dummy AVPlayerLayer, then added an AVSampleBufferDisplayLayer to it as a sublayer?

tmm1 commented 4 years ago

FYI I tried my sublayer idea and it didn't work.

I'm curious if you ever got anywhere with the fmp4 resource loader concept?

mark-veenstra commented 4 years ago

iOS 14 has added the new Picture in Picture feature: https://developer.apple.com/documentation/avkit/adopting_picture_in_picture_in_a_custom_player

ceaglest commented 4 years ago

Hey @mark-veenstra,

iOS 14 has added the new Picture in Picture feature:

Definitely, that is the subject of our recent discussion. Unfortunately your custom player still has to use AVPlayerLayer in iOS 14.

Hi @tmm1,

FYI I tried my sublayer idea and it didn't work. I'm curious if you ever got anywhere with the fmp4 resource loader concept?

I made some progress in terms of re-encoding live video content from a TVIVideoRenderer to H.264 using a very short GOP. This is a good start, but I haven't had a chance to implement the resource loader yet. Without the resource loader the live portion doesn't actually work, but I can view the recorded bitstreams for later. I'll let you know if I get something working with PiP.

Best, Chris

tmm1 commented 4 years ago

This is a good start, but I haven't had a chance to implement the resource loader yet. Without the resource loader the live portion doesn't actually work, but I can view the recorded bitstreams for later. I'll let you know if I get something working with PiP.

Cool. Good luck with ResourceLoader. I tried to do the same thing with HLS TS segments, but it turns out you cannot use the resource loader to serve these (because AVFoundation must be able to download them directly in order to measure bandwidth consumption for bitrate switching).

I'm not sure if that same caveat applies to fMP4, but I hope not!

See https://developer.apple.com/forums/thread/113063?answerId=613266022#613266022

ceaglest commented 4 years ago

Thanks for the heads up! I posted a comment on that thread just in case the Apple engineer is still monitoring it. It could be tricky, I guess; I would rather not host an HTTPS web server in order to get this working. (Hilariously, I think this would require local network permissions in iOS 14.)

Edit: Actually, 127.0.0.1 / localhost / loopback is exempt from the permissions check in iOS 14.

tmm1 commented 4 years ago

Welp I guess there's bad news from Apple:

Yes, the same limitation exists with any media segment (TS, fmp4, packed audio, WebVTT, whatever).

You should write a bug describing your use case to ask for AVPictureInPictureController support in AVSampleBufferDisplayLayer.

Until you get that you can look into using LL-HLS vended from an HTTP server running on the device. HLS isn't really suited to low-latency applications but LL-HLS is much better.

Note that there's no particular reason that a web server needs to hit the disk. At the end of the day it's just an application that is listening on a TCP port and responding to HTTP GET requests.

ceaglest commented 4 years ago

Oh well, at least they were nice enough to respond. I sometimes write HTTP servers for testing; I guess it wouldn't be that hard to use Network.framework to host a server and have it return the content from memory. But still, it's adding more complexity to an already complex solution. Not giving up entirely on the idea, but it's a bit of a setback.
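
A bare-bones version of that idea with Network.framework, for illustration only (a real server would parse the request path and return the matching playlist or segment; this one always returns a single in-memory body):

```swift
import Foundation
import Network

// Sketch: loopback HTTP server serving in-memory bytes, so AVPlayer can fetch
// segments from http://127.0.0.1:8080/ without touching the disk.
final class LoopbackHTTPServer {
    private let listener: NWListener
    var responseBody = Data()   // e.g. a playlist or fMP4 segment

    init(port: UInt16) throws {
        listener = try NWListener(using: .tcp, on: NWEndpoint.Port(rawValue: port)!)
    }

    func start() {
        listener.newConnectionHandler = { [weak self] connection in
            connection.start(queue: .main)
            connection.receive(minimumIncompleteLength: 1, maximumLength: 64 * 1024) { data, _, _, _ in
                guard let self = self, data != nil else { return }
                // A real implementation would parse the request line here.
                let header = "HTTP/1.1 200 OK\r\nContent-Length: \(self.responseBody.count)\r\nConnection: close\r\n\r\n"
                var response = Data(header.utf8)
                response.append(self.responseBody)
                connection.send(content: response, completion: .contentProcessed { _ in
                    connection.cancel()
                })
            }
        }
        listener.start(queue: .main)
    }
}
```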

tmm1 commented 4 years ago

I've used https://github.com/swisspol/GCDWebServer before to serve up TS segments. Definitely not ideal, but it gets the job done.

dmallory commented 4 years ago

Has there been any further progress on this or some starting point code for the pieces working so far? We were hoping to enable this around iOS 14 launch to support PiP for our communication app as well.

ceaglest commented 4 years ago

Hi @dmallory,

Thank you for the interest in this feature. At the moment resolving breaking issues (like Local Network Permissions) in iOS 14 is our top priority before we get to new things.

We are still in the putting-the-pieces-together phase here. Since I last wrote, I've also tried using a live LL-HLS segmenter running on my local network which takes the camera on my Mac and rebroadcasts it to the PiP in iOS 14. The segmenter was ffmpeg, and I may need to tune some settings to improve things, but the delay was still in the seconds range. Using smaller segments, smaller chunk sizes, or a lower target duration might help improve latency. Ideally we could get to whatever the theoretical minimum is for LL-HLS with zero-latency networking.

I will post more updates once we are in a better spot regarding iOS 14 networking, but as you can imagine I can't guarantee this feature in time for the iOS 14 launch while more critical issues are outstanding.

Best, Chris

dmallory commented 4 years ago


Thanks for the update. For fun I also tried doing this at the last mile using ReplayKit - scrape the video from the current layer, encode and rebroadcast with a local server to a supported layer - and it's as bad as you would expect, though a fun demo. Keeping accurate realtime audio and video will always be most important, but I'm excited to look at anything that gets the latency down enough where that looks possible.

ceaglest commented 4 years ago

Interesting, thank you for sharing your prototyping experience πŸ‘.

magneticrob commented 3 years ago

Hey @dmallory, as most of the iOS 14 issues appear to be resolved now, has there been any update on PiP support for Twilio Video?

tmm1 commented 3 years ago

You should write a bug describing your use case to ask for AVPictureInPictureController support in AVSampleBufferDisplayLayer.

Sounds like this is coming in iOS 15


http://codeworkshop.net/objc-diff/sdkdiffs/ios/15.0/AVKit.html

ceaglest commented 3 years ago

Most definitely. Very good news!

There was an old prototype of a renderer using AVSampleBufferDisplayLayer. It sounds like there is now a very good reason to use this rendering technique in a production application.

https://github.com/twilio/video-quickstart-ios/pull/286
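
Based on the AVKit diff above, the iOS 15 API would be used roughly like this (a sketch against the beta headers; the sample buffer layer is assumed to be whatever layer a CMSampleBuffer-based renderer draws into):

```swift
import AVKit
import AVFoundation

// Sketch (iOS 15+): drive PiP from an AVSampleBufferDisplayLayer instead of AVPlayerLayer.
final class PiPCoordinator: NSObject, AVPictureInPictureSampleBufferPlaybackDelegate {
    private var pipController: AVPictureInPictureController?

    func setUp(with sampleBufferLayer: AVSampleBufferDisplayLayer) {
        guard AVPictureInPictureController.isPictureInPictureSupported() else { return }
        let contentSource = AVPictureInPictureController.ContentSource(
            sampleBufferDisplayLayer: sampleBufferLayer,
            playbackDelegate: self)
        pipController = AVPictureInPictureController(contentSource: contentSource)
    }

    func startIfPossible() {
        if pipController?.isPictureInPicturePossible == true {
            pipController?.startPictureInPicture()
        }
    }

    // MARK: - AVPictureInPictureSampleBufferPlaybackDelegate
    func pictureInPictureController(_ controller: AVPictureInPictureController, setPlaying playing: Bool) {
        // A live call is always "playing"; pause could toggle track enablement instead.
    }

    func pictureInPictureControllerTimeRangeForPlayback(_ controller: AVPictureInPictureController) -> CMTimeRange {
        // Live content: report an indefinite time range.
        return CMTimeRange(start: .negativeInfinity, duration: .positiveInfinity)
    }

    func pictureInPictureControllerIsPlaybackPaused(_ controller: AVPictureInPictureController) -> Bool {
        return false
    }

    func pictureInPictureController(_ controller: AVPictureInPictureController,
                                    didTransitionToRenderSize newRenderSize: CMVideoDimensions) {
        // Optionally request a different capture/subscription resolution here.
    }

    func pictureInPictureController(_ controller: AVPictureInPictureController,
                                    skipByInterval skipInterval: CMTime,
                                    completion completionHandler: @escaping () -> Void) {
        completionHandler() // Seeking doesn't apply to a live call.
    }
}
```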

tmm1 commented 3 years ago

FYI, I couldn't get the new PiP API to work. Not sure if I'm doing something wrong or it's broken in the current betas.

ceaglest commented 3 years ago

Not sure, I will try locally and am attending an AVKit lab tomorrow to find out more.

21-Hidetaka-Ko commented 3 years ago

@tmm1 @ceaglest Were you able to solve the problem here and implement PiP mode? I am having the same problem.

tmm1 commented 3 years ago

It is apparently unimplemented in tvOS 15 Beta 1.

21-Hidetaka-Ko commented 3 years ago

@tmm1 Thanks. Does this mean that we won't be able to implement it this fall when iOS 15 is released? Do you think it will be ready for implementation this fall, and is there any way to confirm this?

21-Hidetaka-Ko commented 3 years ago

Most of the discussion about Picture in Picture is about video playback services like YouTube and Netflix, but I'm more interested in Picture in Picture mode for calling apps like FaceTime.

There are three points I'm interested in:

  1. Is it possible to implement Picture in Picture mode in calling apps before iOS 15?
  2. Is it likely to be possible to implement Picture in Picture mode in a calling app after iOS 15 this fall?
  3. Is it possible to mute my own voice or someone else's voice while the call screen is in Picture in Picture mode? With FaceTime, you can't toggle mute in Picture in Picture mode; you need to jump back to FaceTime first.

21-Hidetaka-Ko commented 3 years ago

If Twitch and Netflix can support Picture in Picture mode, but other video calling apps can't, is it because they're not allowed by Apple? Or is it just a lack of resources?