tauri-apps / wry

Cross-platform WebView library in Rust for Tauri.
Apache License 2.0

Fix `getDisplayMedia()` and `getUserMedia()` permission prompt on macOS #1195

Open pewsheen opened 5 months ago

pewsheen commented 5 months ago

Current Status on macOS 14

but with requestMediaCapturePermissionForOrigin delegated in wry

A possible fix for now is to delegate both requestMediaCapturePermissionForOrigin and requestDisplayCapturePermissionForOrigin. But then, when we call getDisplayMedia(), it can only choose a screen.

Upstream tracking: https://bugs.webkit.org/show_bug.cgi?id=271688

Workaround PR: https://github.com/tauri-apps/wry/pull/1196

--- outdated ---

The media permission prompt was suppressed when macOS 14.0 introduced

Issue: https://github.com/tauri-apps/wry/issues/1101, https://github.com/tauri-apps/tauri/issues/2600
PR: https://github.com/tauri-apps/wry/pull/1111

Note: #1111 introduced a regression: https://github.com/tauri-apps/tauri/issues/9178

After an OS update, possibly 14.2 (not sure), Apple fixed the original issue, so we reverted the fix:

Issue: https://github.com/tauri-apps/tauri/issues/8979
PR: https://github.com/tauri-apps/wry/pull/1186

We need to find the affected OS version (14.0 ~ 14.x.x) and apply the version check like https://github.com/tauri-apps/wry/pull/1111

TODO

FabianLars commented 5 months ago

i think i can test 14.1.2 later today

FabianLars commented 5 months ago

I can confirm that it only works on 14.1.2 after i changed `(14, 0, _)` to `(14, 0, _) | (14, 1, _)` (is that the best way to do the pattern matching? 😅)
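On the pattern-matching question: a minimal sketch of the same check written with a nested or-pattern, which Rust accepts inside a tuple pattern. The function name is hypothetical and not taken from wry, and the affected range (14.0.x and 14.1.x) is only what this thread has observed so far, not a confirmed boundary.

```rust
/// Hypothetical helper, not wry's actual code.
/// Takes a (major, minor, patch) macOS version and reports whether it falls
/// in the range this thread assumes to be affected (14.0.x and 14.1.x).
fn needs_media_prompt_workaround(version: (u64, u64, u64)) -> bool {
    // `(14, 0 | 1, _)` matches the same versions as `(14, 0, _) | (14, 1, _)`.
    matches!(version, (14, 0 | 1, _))
}
```

Both spellings behave the same; the nested form just avoids repeating the major version when more minor versions have to be added.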

FabianLars commented 5 months ago

Looks like we already know that it's fixed in 14.2.1 if 8979 is indeed the same issue so i'm just gonna try to check 14.2.0 and not the latest 14.2 patch ver

FabianLars commented 5 months ago

Also a bit offtopic but the way we handle getDisplayMedia requests in general is a bit incorrect. We're never actually asking for permissions, the dialog that shows is just the display/window selection screen - firefox equivalent for reference:

[screenshot: Firefox display/window selection dialog]

In firefox, if you select a window and then click on allow, then it will ask for permissions:

[screenshot: Firefox permission prompt]

(sorry for german, just test https://webrtc.github.io/samples/src/content/getusermedia/getdisplaymedia/ yourself).

Afaik this is because the macos permission system, at least for screen recording, is on a "pretty please 😒" basis, meaning that apps/frameworks have to implement, read and adhere to permissions themselves. And yes, they can choose not to do this (which we effectively did too), though idk if that will cause problems in the app store. I remember you also said something like "I messed up my environment and it never asks for permissions", which very likely wasn't the case, and/or maybe you're confusing it with microphone permissions, which iirc are handled more correctly.

pewsheen commented 5 months ago

Hmm I think I also found something...

It's getDisplayMedia and getUserMedia

https://github.com/tauri-apps/tauri/issues/2600 -> getDisplayMedia -> not showing prompt
https://github.com/tauri-apps/tauri/issues/8979 -> getUserMedia -> showing duplicate prompt

So if we overwrite the responder, getDisplayMedia will not work, but getUserMedia will be fine. If we don't, getDisplayMedia will work, but getUserMedia will show a prompt for both the application and the webview every time.

It might not be caused by the OS version; different Web APIs may just have different OS behavior.

FabianLars commented 5 months ago

Okay, so i went through my old notes from a while ago and together with the new findings i'm fairly confident that the only way to fix both getUserMedia and getDisplayMedia is by using 2 private webview apis. (What's also weird is that you must have the plist for mic/cam usage for display media to work but i don't want to think about that lol)

One thing i don't know yet is if macos 13 and 14 (and maybe 12) really behaved differently or if we just think/thought it did. I can look into that later, for now let's focus on 14.

I will push my wip experiment onto your branch for you to try. This still has the no-permission behavior explained in my comment above but it should be fairly easy to add (i had that in my prior experiments a few months ago but sadly deleted the branch because i thought it was indeed fixed now and my changes were just over-engineered 😒)

FabianLars commented 5 months ago

Okay so just like last time i wasn't able to actually get it working (i still had stuff commented out in my comment above 😂). I still pushed it to your branch for you to look at. I also added a few comments. Most of it is based on https://github.com/SafeExamBrowser/seb-mac (for example https://github.com/SafeExamBrowser/seb-mac/commit/3b05273116799d300083aed4c013117e89178640).

My knowledge about obj-c, or at least obj-c in a rust project, is too lacking to finish this so i hope that you can figure it out and that what i pushed may help you.

pewsheen commented 5 months ago

Ok.... so it's my turn to become a helldiver

Basically, I made no progress today 😭 Here are some of my guesses so far:

The permission requests all pass through something called UserMediaPermissionRequestProxy.

Normally, without providing requestMediaCapturePermissionForOrigin, UserMediaPermissionRequestProxy will call some API to bring up ScreenCaptureKit to show the (window/screen) selection panel, and then take the video input you select to ask for camera/microphone permission.

But if we provide the requestMediaCapturePermissionForOrigin function, it somehow skips the ScreenCaptureKit step, asks for system permission with an empty video and audio list, and gets rejected...

Also, creating a new project in Xcode and providing requestMediaCapturePermissionForOrigin will have the same issue.

I'll continue the research next few days, but if there's no solution, we might need to set a stop-loss point.

FabianLars commented 5 months ago

> I'll continue the research next few days, but if there's no solution, we might need to set a stop-loss point.

While i do hope you'll find a solution i think it's fine if we end up having a config that switches between disabling the double permission prompt for mic/cam or supporting getDisplayMedia. Since officially getDisplayMedia isn't supported in wkwebview i think this is an okay drawback until we have an alternative to wkwebview or figure out a solution.
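A rough sketch of what such a switch could look like, assuming a two-state setting. Every name here is hypothetical and nothing like it exists in wry at the time of this thread; the behavior of each variant is taken from the trade-off described earlier (overriding the responder fixes getUserMedia but breaks getDisplayMedia, and vice versa).

```rust
/// Hypothetical config value, not part of wry's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MediaPermissionMode {
    /// Install the requestMediaCapturePermissionForOrigin delegate:
    /// no duplicate mic/cam prompt, but getDisplayMedia does not work.
    SuppressDuplicatePrompt,
    /// Leave the delegate unset: getDisplayMedia works,
    /// but getUserMedia prompts for both the app and the webview.
    SupportDisplayMedia,
}

impl MediaPermissionMode {
    /// Whether the webview setup should install the media capture delegate.
    fn installs_media_capture_delegate(self) -> bool {
        matches!(self, MediaPermissionMode::SuppressDuplicatePrompt)
    }

    /// Whether getDisplayMedia is expected to work in this mode,
    /// according to the observations in this thread.
    fn display_media_works(self) -> bool {
        !self.installs_media_capture_delegate()
    }
}
```

The point of modelling it as an enum rather than a bool is that a third variant can be added if a real fix for both APIs turns up.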

pewsheen commented 5 months ago

I think something is wrong with https://github.com/WebKit/WebKit/blob/a0b7f7faeffb5ec679ddd471a0f7c68a91a37715/Source/WebKit/UIProcess/Cocoa/UIDelegate.mm#L1212, so it didn't show the picker.

Currently, the best we can do is add requestDisplayCapturePermissionForOrigin so that getDisplayMedia() can select either a screen or a window only, while we are still able to suppress the duplicate prompt when calling getUserMedia(). @FabianLars what do you think?

FabianLars commented 5 months ago

If we expose the decision handler for requestDisplayCapturePermissionForOrigin on the webview builder then this sounds really solid to me. We just have to make it possible for the app's end user to choose between window and monitor mode; it doesn't make sense for 99.9% of apps to leave that to the app dev (or worse, us). In tauri for example we'll probably just add a built-in use of rfd and call it a day.
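To make the idea concrete, here is a minimal sketch of a builder that stores a decision handler. All types and method names are hypothetical stand-ins, not wry's real `WebViewBuilder` API, and denying by default when no handler is set is an assumption of this sketch. A real implementation would call the handler from the requestDisplayCapturePermissionForOrigin delegate and could show a dialog (for example via rfd) inside it.

```rust
/// Hypothetical decision type: deny, or let the user pick a screen or a window.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DisplayCaptureDecision {
    Deny,
    ScreenPrompt,
    WindowPrompt,
}

/// Hypothetical stand-in for a webview builder; not wry's actual type.
struct WebViewBuilderSketch {
    display_capture_handler: Option<Box<dyn Fn(&str) -> DisplayCaptureDecision>>,
}

impl WebViewBuilderSketch {
    fn new() -> Self {
        Self { display_capture_handler: None }
    }

    /// Registers a handler that receives the requesting origin and returns a decision.
    fn with_display_capture_handler(
        mut self,
        handler: impl Fn(&str) -> DisplayCaptureDecision + 'static,
    ) -> Self {
        self.display_capture_handler = Some(Box::new(handler));
        self
    }

    /// What the delegate would call; denies when no handler is set (assumption).
    fn decide_display_capture(&self, origin: &str) -> DisplayCaptureDecision {
        match &self.display_capture_handler {
            Some(handler) => handler(origin),
            None => DisplayCaptureDecision::Deny,
        }
    }
}
```

An app would register something like `|origin| if origin.starts_with("https://") { DisplayCaptureDecision::WindowPrompt } else { DisplayCaptureDecision::Deny }`, or open a dialog and map the end user's choice to `ScreenPrompt` or `WindowPrompt`.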