web-platform-tests / wpt

Test suites for Web platform specs — including WHATWG, W3C, and others
https://web-platform-tests.org/
Other
5.02k stars 3.12k forks source link

`sudo safaridriver --enable` not working on macOS Catalina #21751

Open foolip opened 4 years ago

foolip commented 4 years ago

In https://github.com/web-platform-tests/wpt/pull/21628 I attempted to upgrade the Azure Pipelines VMs for macOS 10.14 (Mojave) to 10.15 (Catalina). It didn't work because it's not possible to enable remote automation for STP or Safari stable. I've filed a WebKit bug about this: https://bugs.webkit.org/show_bug.cgi?id=207617

This issue is just for tracking this on the WPT side and linking together various previous issues that could be related or useful:

At this point the issue is blocked on https://bugs.webkit.org/show_bug.cgi?id=207617

@burg

foolip commented 4 years ago

I just tried rebasing/retrying https://github.com/web-platform-tests/wpt/pull/21628 and it still fails for the same reason.

@burg https://bugs.webkit.org/show_bug.cgi?id=207617 was closed as fixed, but maybe that wasn't the problem? Once macOS 10.16 is released and supported on Azure Pipelines, 10.14 will likely be dropped. This has been the pattern in the past. If this issue still affects 10.16, that would make this issue urgent – no Safari runs on wpt.fyi at all. Is there anything we can do to help get to the bottom of this?

burg commented 4 years ago

Where is the invocation of sudo safaridriver --enable ? I can't locate it in the pipeline run, just the part where Safari Technology Preview Webdriver tests are erroring out.

Taking a sysdiagnose after it fails should include logs for safaridriver --enable as well as the preference check when running tests. Can you file a new issue at feedbackassistant.apple.com http://feedbackassistant.apple.com/ and paste the feedback Id here?

On Apr 24, 2020, at 4:23 AM, Philip Jägenstedt notifications@github.com wrote:

I just tried rebasing/retrying #21628 https://github.com/web-platform-tests/wpt/pull/21628 and it still fails for the same reason.

@burg https://github.com/burg https://bugs.webkit.org/show_bug.cgi?id=207617 https://bugs.webkit.org/show_bug.cgi?id=207617 was closed as fixed, but maybe that wasn't the problem? Once macOS 10.16 is released and supported on Azure Pipelines, 10.14 will likely be dropped. This has been the pattern in the past. If this issue still affects 10.16, that would make this issue urgent – no Safari runs on wpt.fyi at all. Is there anything we can do to help get to the bottom of this?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/web-platform-tests/wpt/issues/21751#issuecomment-618953979, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAARIGDLQ4N3KWNPHEQENZTROFZCXANCNFSM4KTVSHOA.

stephenmcgruer commented 4 years ago

@burg sorry for the delay here, I missed that you had replied.

I'm not an expert here, but we call sudo "/Applications/Safari Technology Preview.app/Contents/MacOS/safaridriver" --enable in tools/ci/azure/install_safari.yml. This is the 'Install Safari Technology Preview' step of https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=46833&view=logs&j=c092c851-ddde-53a1-8da2-b0c5715821df&t=d3ae2257-9828-5d99-9d7a-e89be23c7f6e

I will look into getting a sysdiagnose for you.

stephenmcgruer commented 4 years ago

@burg FB7681340 should be the feedbackassistant.apple.com report, which includes the sysdiagnose taken from this run: https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=47155&view=logs&j=c092c851-ddde-53a1-8da2-b0c5715821df&t=dbea33bd-3dc2-5085-1162-71b50606b5b5

(Note that it looks green, that's because I stuck a || true at the end of the wpt script as a quick hack to get to the next task. The actual logs show Safari still not accepting a remote connection)

burg commented 4 years ago

Great, thanks! I'll take a look when it comes through.

On Apr 29, 2020, at 12:28 PM, Stephen McGruer notifications@github.com wrote:

@burg https://github.com/burg FB7681340 should be the feedbackassistant.apple.com report, which includes the sysdiagnose taken from this run: https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=47155&view=logs&j=c092c851-ddde-53a1-8da2-b0c5715821df&t=dbea33bd-3dc2-5085-1162-71b50606b5b5 https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=47155&view=logs&j=c092c851-ddde-53a1-8da2-b0c5715821df&t=dbea33bd-3dc2-5085-1162-71b50606b5b5 (Note that it looks green, that's because I stuck a || true at the end of the wpt script as a quick hack to get to the next task. The actual logs show Safari still not accepting a remote connection)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/web-platform-tests/wpt/issues/21751#issuecomment-621412379, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAARIGC7WLPGTJV6VXQUYMDRPB5VNANCNFSM4KTVSHOA.

stephenmcgruer commented 4 years ago

@burg - any luck in getting access to the sysdiagnose? (I'm not clear how long feedbackassistant reports take to reach you, so sorry if this is too soon a ping :))

burg commented 4 years ago

@stephenmcgruer I got it yesterday, thanks! I'll let you know when I take a look.

foolip commented 4 years ago

Some good news here. Per https://github.com/web-platform-tests/wpt/pull/21628#issuecomment-629845122 it looks like sudo safaridriver --enable for Safari stable does work on macOS Catalina. But for STP 106 it still does not.

This is sort of good news, because the problem should then be fixable in STP. And we won't be stuck on macOS 10.14 for stable runs, which would be a big problem when Azure Pipelines eventually drops support for it.

foolip commented 4 years ago

In https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=48296 I tried pinning to STP 94 to see if this might have been due a recent regression in STP, but that still failed.

To be clear, I am not looking into this any further, I just wanted to see if https://github.com/web-platform-tests/wpt/pull/23654 had unblocked this, but it hasn't. Hopefully the sysdiagnose @stephenmcgruer submitted will help diagnose this.

Hexcles commented 4 years ago

Does #23654 fix this?

stephenmcgruer commented 4 years ago

Does #23654 fix this?

No; it works only in Safari stable, but not STP 106. My understanding from @burg is that there will be some more logging in ~STP 107 and we'll hopefully learn more about the issue then!

stephenmcgruer commented 4 years ago

@burg STP 107 has now made it to WPT. The issue appears to still reproduce; I just grabbed another sysdiagnose in https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=48981&view=logs&j=af75b4de-b26f-5d4d-58e8-b41258076d44 and have uploaded it as a comment to feedback report FB7681340 .

Let me know if there's any more information you need from this end :).

stephenmcgruer commented 4 years ago

Ping @burg - have you been able to look at the results from STP 107? STP 108 just hit WPT so I will be attempting a reproduction on that in the next week, but hopefully you can get some more details from STP 107 too.

stephenmcgruer commented 4 years ago

Bringing in some updates from https://github.com/web-platform-tests/wpt/pull/21628:

stephenmcgruer, 12th June 2020:

Unfortunately at this point we're quite stuck on this issue. Folks from Apple have been unable to figure it out even with additional logging:

I am unable to reproduce the Azure issue on a vanilla customer build with Safari Technology Preview 108.
If I had to guess, something weird is going on, such as running safaridriver via a non-UI shell
or weird permissions to ~/Library/
Unfortunately no error is returned on the file write operation, so it's hard to say what it thinks is going on.

We've tried various things to make the plist file appear, but it just won't. Without ssh-access to the bots for interactive debugging, we're rather out of options.

To guard against 'what happens when Catalina becomes the default MacOS', I guess we should experiment and see if we can use a workaround plist file again.

Hexcles, 12th June 2020:

I tried to reproduce it locally and failed, but had some interesting observations.

~/Library/WebDriver seems super special from the perspective of a Linux user. The directory did not exist until I ran sudo "/Applications/Safari Technology Preview.app/Contents/MacOS/safaridriver" --enable, which created the directory that belongs to me (the primary user) instead of root:

ls -ld ~/Library/WebDriver
drwxr-xr-x  3 robertma  primarygroup  96 12 Jun 15:25 /Users/robertma/Library/WebDriver

However, I cannot cd or ls it. I tried a root shell (via sudo -i) and I could cd into it but still couldn't ls. I managed to open it in Finder (Library is hidden by default in Finder) and found com.apple.SafariTechnologyPreview.plist in there, which I was able to open with VSCode:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>AllowRemoteAutomation</key>
  <true/>
</dict>
</plist>

Even with the full path name, I still can't access the file in a terminal at all.

If we suspect something is wrong with this plist, how can we debug it in a script? I don't think our current attempt of ls -l -@ ~/Library/WebDriver is useful if we can't even open it.

Also on June 12th there was an attempt to use the old hacky method of copying a working plist file from another Mac device, but it appeared to still have problems (https://github.com/web-platform-tests/wpt/pull/21628#issuecomment-643479930). This was not fully investigated yet.

burg commented 4 years ago

Hexcles wrote:

~/Library/WebDriver seems super special from the perspective of a Linux user.

It's protected in the same way that file paths for contacts, photos, etc. are more locked down starting in Catalina. This is to mitigate the risk of malware trying to enable WebDriver by writing an magic plist to disk. safaridriver has special sandbox access to ~/Library/WebDriver/. The --enable codepath has been designed to create any files/directories as the real user rather than as root, since that would make it impossible to use safaridriver as non-root.

Without ssh access, I'm not sure what else to try here.

Has anyone spun up a macOS Big Sur beta machine to see if it is also affected?

-Brian

On Jul 1, 2020, at 7:38 AM, Stephen McGruer notifications@github.com wrote:

Bringing in some updates from #21628 https://github.com/web-platform-tests/wpt/pull/21628:

stephenmcgruer, 12th June 2020:

Unfortunately at this point we're quite stuck on this issue. Folks from Apple have been unable to figure it out even with additional logging:

I am unable to reproduce the Azure issue on a vanilla customer build with Safari Technology Preview 108. If I had to guess, something weird is going on, such as running safaridriver via a non-UI shell or weird permissions to ~/Library/ Unfortunately no error is returned on the file write operation, so it's hard to say what it thinks is going on. We've tried various things to make the plist file appear, but it just won't. Without ssh-access to the bots for interactive debugging, we're rather out of options.

To guard against 'what happens when Catalina becomes the default MacOS', I guess we should experiment and see if we can use a workaround plist file again.

Hexcles, 12th June 2020:

I tried to reproduce it locally and failed, but had some interesting observations.

~/Library/WebDriver seems super special from the perspective of a Linux user. The directory did not exist until I ran sudo "/Applications/Safari Technology Preview.app/Contents/MacOS/safaridriver" --enable, which created the directory that belongs to me (the primary user) instead of root:

ls -ld ~/Library/WebDriver drwxr-xr-x 3 robertma primarygroup 96 12 Jun 15:25 /Users/robertma/Library/WebDriver However, I cannot cd or ls it. I tried a root shell (via sudo -i) and I could cd into it but still couldn't ls. I managed to open it in Finder (Library is hidden by default in Finder) and found com.apple.SafariTechnologyPreview.plist in there, which I was able to open with VSCode:

<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

AllowRemoteAutomation

Even with the full path name, I still can't access the file in a terminal at all.

If we suspect something is wrong with this plist, how can we debug it in a script? I don't think our current attempt of ls -l -@ ~/Library/WebDriver is useful if we can't even open it.

Also on June 12th there was an attempt to use the old hacky method of copying a working plist file from another Mac device, but it appeared to still have problems (#21628 (comment) https://github.com/web-platform-tests/wpt/pull/21628#issuecomment-643479930). This was not fully investigated yet.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/web-platform-tests/wpt/issues/21751#issuecomment-652457867, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAARIGB4YNCFVLANJUWSKRDRZNC45ANCNFSM4KTVSHOA.

miketimofeev commented 4 years ago

Hey folks @stephenmcgruer @foolip! We're working on building and delivering AzDO macOS images. How can we help you debug this issue? Unfortunately, we can't provide any access to VMs like ssh yet we're able to run any commands on those VMs and come back with the results. cc @al-cheb @maxim-lobanov

stephenmcgruer commented 4 years ago

Thanks @miketimofeev , appreciate the help! To summarize the problem;

  1. We currently install Safari Tech Preview on Azure Pipelines via a yml task that installs it via homebrew.
  2. Importantly, we then enable Safaridriver by calling sudo "/Applications/Safari Technology Preview.app/Contents/MacOS/safaridriver" --enable - this does all sorts of clever things under the hood, but ultimately is meant to write a plist file to a special location on the drive (~/Library/WebDriver/).

Historically, this has worked fine on MacOS Mojave, but as of Catalina it is not longer working. According to @burg , who is the Apple engineer that maintains Safaridriver, it appears to work from logs we have provided him, but no plist file exists in the end and the functionality of Safaridriver is not enabled.

Now, one thing we have learned very recently is that it doesn't work on other CI systems either, which likely means this is not a specific problem with Azure Pipelines.

The latest question we have from Apple's side is here, where Brian asked:

The only thing I could think of is whether it's a UI session or a remote session (screen sharing / SSH). There are some restrictions associated with a non-UI login, but I did not think that part of it was being unable to read/write restricted files. Can someone confirm if the CI logins are SSH or UI?

If that is a question that you @miketimofeev or one of your team can answer, that would be a great start to hopefully solving this problem!

A second question would be if we know when Big Sur images will become available, as Brian has also asked about that.

miketimofeev commented 4 years ago

@stephenmcgruer basically each VM contains azure-pipelines agent, which receives a signal and start the job so it's a non-interactive session with some level of abstraction as you can see from logs like these(I've tried to install STP with your brew formula) image

And I can confirm 'Allow remote automation' is not enabled after the installation image We've enabled this setting for stable Safari in the clean macOS image that we use for image generation.

About Big Sur, we don't have the plan to add new MacOS for now. Usually, we add new versions of macOS only after stable release.

Hexcles commented 4 years ago

We've enabled this setting for stable Safari in the clean macOS image that we use for image generation.

Hopefully this also works for Safari Technology Preview.

Looking at this, I'm wondering if the "launcher" app (the agent?) needs to be given full disk access permission.

miketimofeev commented 4 years ago

@Hexcles I think it should work, but STP version needs to be preinstalled on the clean image for this. We came across some issues related to the disk access and granted the launcher full access permission.

al-cheb commented 4 years ago

Hello, @stephenmcgruer, @foolip ,@burg I have created a small repository to reproduce the issue - https://github.com/al-cheb/test with the script https://stackoverflow.com/questions/44006628/how-to-enable-allow-remote-automation-in-safari-programmatically which enables checkbox.

    steps:
      - uses: actions/checkout@v2
        with:
          repository: 'al-cheb/test'
      - name: install safari-preview
        run: |
            HOMEBREW_NO_AUTO_UPDATE=1 brew cask install safari-technology-preview.rb
            chmod +x allowremote
            ./allowremote
            sudo "/Applications/Safari Technology Preview.app/Contents/MacOS/safaridriver" --enable
            defaults write com.apple.SafariTechnologyPreview WebKitJavaScriptCanOpenWindowsAutomatically 1
            defaults write com.apple.SafariTechnologyPreview ExperimentalServerTimingEnabled 1
      - name: plist
        run: |
            cat ~/Library/WebDriver/com.apple.SafariTechnologyPreview.plist

image

Could you please provide a simple test to validate from our side?

Hexcles commented 4 years ago

So allowremote is the AppleScript in the SO link?

stephenmcgruer commented 4 years ago

Looks like its https://github.com/al-cheb/test/blob/master/allowremote

Hexcles commented 4 years ago

Alright, same script. @al-cheb it looks like allowremote itself should suffice. Could you try just running allowremote without safaridriver --enable? Also, maybe try running only safaridriver --enable without allowremote to confirm.

al-cheb commented 4 years ago

@Hexcles,

  1. safaridriver --enable without allowremote image

  2. allowremote without safaridriver --enable image

stephenmcgruer commented 4 years ago

So I tried this today for the Catalina upgrade PR (https://github.com/web-platform-tests/wpt/pull/21628/files), and it had trouble launching (I think?) STP: https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=51229&view=logs&j=af75b4de-b26f-5d4d-58e8-b41258076d44&t=b085271b-ff6b-573d-7913-4d96cb782bdc

========================== Starting Command Output ===========================
/bin/bash --noprofile --norc /Users/runner/work/_temp/24970e4c-5e35-4dd8-a934-5514185ff03f.sh
==> Downloading https://secure-appldnld.apple.com/STP/001-16091-20200610-09f04256-ae36-4930-b7c4-b1333f8d8e5f/SafariTechnologyPreview.dmg
==> Verifying SHA-256 checksum for Cask 'safari-technology-preview'.
Warning: macOS's Gatekeeper has been disabled for this Cask
==> Installing Cask safari-technology-preview
==> Running installer for safari-technology-preview; your password may be necessary.
==> Package installers may write to any location; options such as --appdir are ignored.
installer: Package name is Safari Technology Preview
installer: Installing at base path /
installer: The install was successful.
🍺  safari-technology-preview was successfully installed!
Calling safari-technology-preview-enable-remote.applescript
./tools/ci/azure/safari-technology-preview-enable-remote.applescript:315:351: execution error: System Events got an error: Can’t get toolbar 1 of window 1 of application process "Safari Technology Preview". Invalid index. (-1719)
Does plist exist?
cat: /Users/runner/Library/WebDriver/com.apple.SafariTechnologyPreview.plist: No such file or directory

I haven't done any investigation yet.

al-cheb commented 4 years ago

@stephenmcgruer You should always install the latest Safari Technology Preview otherwise the update popup window prevents running the apple script.

foolip commented 4 years ago

@stephenmcgruer can you try throwing https://github.com/web-platform-tests/wpt/pull/24393 on top of the changes and seeing if that works?

If it does, I guess that's good, but it's not good that this workaround will start failing every time there's a new release of STP which hasn't been merged into WPT yet. If that's the case we might as well not pin STP, since it will break anyway.

stephenmcgruer commented 4 years ago

I see, thanks both. That works, so we now have two workarounds (the other being the known-good-plist) to the fact that safaidriver --enable no longer work on CI systems once upgraded to Catalina (not just Azure Pipelines).

Unfortunately, both workarounds show the same thing: the infrastructure tests are broken on Catalina ([1], [2]):

JavascriptErrorException: javascript error (500): A JavaScript exception occured: window.__wptrunner_process_next_event is not a function. (In 'window.__wptrunner_process_next_event()', 'window.__wptrunner_process_next_event' is undefined)

I'm running a full trigger run currently, to see how this affects the test suite as a whole. Then we will probably want someone with a Catalina Macbook (@Hexcles, @foolip ?) to try and reproduce the above error.

cc @burg and @gsnedders as an FYI about (a) the workarounds we are looking at here due to safaridriver --enable not working on Catalina, and (b) the potential that on top of that we're now looking at a strange problem in Catalina+ that is causing __wptrunner_process_next_event to not exist.

Hexcles commented 4 years ago

You should always install the latest Safari Technology Preview otherwise the update popup window prevents running the apple script.

If that's the case, then the AppleScript workaround might be a bit too finicky; it'd break every time STP gets a new version but we haven't updated ours.

al-cheb commented 4 years ago

You should always install the latest Safari Technology Preview otherwise the update popup window prevents running the apple script.

If that's the case, then the AppleScript workaround might be a bit too finicky; it'd break every time STP gets a new version but we haven't updated ours.

I will take a look at that moment with update window popup.

stephenmcgruer commented 4 years ago

I'm running a full trigger run currently, to see how this affects the test suite as a whole

Full run timed out: https://dev.azure.com/web-platform-tests/wpt/_build/results?buildId=51318&view=logs&j=d41515ed-d71e-5e8d-de88-940c2a934d56&t=ba7296eb-ba60-59d0-62db-869842eecc2d

This almost definitely means something is very broken, because its probably caused by wpt trying to retry tests that are erroring out. Looking at the logs from Azure, definitely seeing many of the :

JavascriptErrorException: javascript error (500): A JavaScript exception occured: window.__wptrunner_process_next_event is not a function. (In 'window.__wptrunner_process_next_event()', 'window.__wptrunner_process_next_event' is undefined)

al-cheb commented 4 years ago

@Hexcles @stephenmcgruer , I have added the statement key code 53 to the script - https://github.com/al-cheb/test/blob/master/allowremote , should work for STP-108 version.

burg commented 4 years ago

Following up on the file write failure... If this is automation being run over SSH or Remote Desktop, how is remote access being enabled on the bots? If this is enabled via System Preferences, then sshd should have full disk access. If it's enabled some other way, then perhaps full disk access needs to be enabled separately in order for safaridriver to write out the plist to disk.

stephenmcgruer commented 4 years ago

Following up on the file write failure... If this is automation being run over SSH or Remote Desktop, how is remote access being enabled on the bots? If this is enabled via System Preferences, then sshd should have full disk access. If it's enabled some other way, then perhaps full disk access needs to be enabled separately in order for safaridriver to write out the plist to disk.

@miketimofeev @al-cheb are either of you able to answer Brian's comment above? (Noting that I believe the 'remote access' he refers to here is remote access to the actual MacOS system itself, not Safari's remote access.

@burg is there some API we can call to detect whether full disk access is enabled?

al-cheb commented 4 years ago

@stephenmcgruer @burg

All scripts are being run via runprovisioner.sh: image

@burg, Could you please provide steps/script to validate from our side?

burg commented 4 years ago

Sorry for dropping off, things have been hectic. We continue to investigate this issue internally.

Based on my testing, the nuclear workaround is to disable SIP (you can check the status with csrutil status). This should avoid file access issues, but I wouldn't recommend it for general purpose computing because security. And it's unlikely that you'd be able to configure a bot for it unless Azure Pipelines already supports that configuration.

stephenmcgruer commented 4 years ago

To keep folks updated here: we managed to update to macOS Catalina using the 'known-good plist' workaround, but as far as I know the bug is still present. @gsnedders and @burg are, I believe, attempting to resolve the underlying issue for Big Sur. (But upgrading to Big Sur is likely quite a long way out, since it depends on Azure Pipelines having Big Sur machines generally available).

foolip commented 3 years ago

Note that while we're still relying on a workaround for Catalina, this isn't the cause of Safari Technology Preview being stuck on an old version. See https://github.com/web-platform-tests/wpt/issues/31147 for why that's happening.

Once we've upgraded to macOS 11 (or later) and no longer need the workaround, this issue can be closed. There's no active work that can be done here, however.