xamarin / xamarin-macios

.NET for iOS, Mac Catalyst, macOS, and tvOS provide open-source bindings of the Apple SDKs for use with .NET managed languages such as C#
Other
2.47k stars 514 forks source link

Version 8.0.401 release broke ability to run app with WebDriverAgent from Appium on iOS #21234

Closed rolfbjarne closed 3 weeks ago

rolfbjarne commented 1 month ago

From @Joost-Jens-Luminis on Fri, 13 Sep 2024 07:26:42 GMT

We have been creating tests using Appium for a few months now for our app on Windows, Android and iOS. Unfortunately, with the release of .Net 8.0.401 and the released iOS workload it broke our ability to test on iOS.

When the WebDriverAgent from Appium is running, it takes to long for the App to start now and it gets killed by the iOS watchdog. If the WebDriverAgent from Appium is not running, the app is starting up fine and working properly.

With .Net version 8.0.300 (and related workloads) it is still working properly. Unfortunately, as we are using a release version, build in Azure Pipelines, to run the tests we cannot downgrade to an earlier version of the workloads.

The following installation is still working fine with Appium testing on iOS (This is from my PC): image

Is there any information on what has changed in net8.0-ios that could have caused this? And how we could work around this issue?

Note: We are a 100% sure Appium didn't cause this, as we hadn't update Appium when the problems started. Of course we did try updating Appium and it's drivers to the latest version but to no avail.

Copied from original issue dotnet/runtime#107781

rolfbjarne commented 1 month ago

From @vitek-karas on Fri, 13 Sep 2024 11:29:09 GMT

@rolfbjarne maybe you have some context. Should we move this to xamarin-macios?

rolfbjarne commented 1 month ago

@Joost-Jens-Luminis it's not clear from your description, but can you reproduce this locally, or does it only happen on CI?

To start diagnosing this, it would be good to get any crash reports from the device: https://github.com/xamarin/xamarin-macios/wiki/Diagnosis#crash-reports

That would reveal what the app is actually doing when it's being terminated.

Joost-Jens-Luminis commented 1 month ago

@rolfbjarne Yes, this can be reproduced locally. A colleague of mine has updated his installation and with his builds (both debug and release builds) the takes to long to start. To be extra clear, the splashscreen becomes visible, but then the app stops responsing and gets closed by the watchdog on iOS after about 19 seconds.

This also means that there is no crash. It just gets closed.

Joost-Jens-Luminis commented 1 month ago

This is a video of what happens. It's boring, but give you an idea https://github.com/user-attachments/assets/98f03e76-f922-4fcb-b64f-078fc3c93134

The first document is the console log from the iOS device regarding what is going on at the time for our App (Bronkhorst FlowcontrolApp). The second one is the full log unfiltered (in cases the filter filters something out that is important) Filtered console log.txt Full console log.txt

I hope this provides enough information to get a clue as to what is going on here.

rolfbjarne commented 1 month ago

This also means that there is no crash. It just gets closed.

iOS will still create a crash report. This is from the device log:

ReportCrash Formulating fatal 309 report for corpse[379] Bronkhorst.FlowControl.iOS

Joost-Jens-Luminis commented 1 month ago

You are correct, this is a crash log I got from the phone. I generated it this morning though as I couldn't find the crashlogs from last Friday. But I assume it's the same crash.

Crash log.txt

rolfbjarne commented 1 month ago

Yes, that crash report is for the watchdog:

exhausted real (wall clock) time allowance of 19.76 seconds

Unfortunately all the important frames in the stack trace are only memory addresses:

Thread 0 name:  tid_103 Dispatch queue: com.apple.main-thread
Thread 0 Crashed:
0   libsystem_kernel.dylib                 0x1bb181484 __psynch_cvwait + 8
1   libsystem_pthread.dylib                0x1dbc09860 _pthread_cond_wait$VARIANT$armv81 + 1224
2   Bronkhorst.FlowControl.iOS             0x104851968 0x104390000 + 4987240
3   Bronkhorst.FlowControl.iOS             0x10491bb5c 0x104390000 + 5815132
4   Bronkhorst.FlowControl.iOS             0x10491ba00 0x104390000 + 5814784
5   Bronkhorst.FlowControl.iOS             0x10493bc90 0x104390000 + 5946512
6   Bronkhorst.FlowControl.iOS             0x1048da590 0x104390000 + 5547408
7   Bronkhorst.FlowControl.iOS             0x104999a6c 0x104390000 + 6330988
8   Bronkhorst.FlowControl.iOS             0x104998404 0x104390000 + 6325252
9   Bronkhorst.FlowControl.iOS             0x10498cddc 0x104390000 + 6278620
10  Bronkhorst.FlowControl.iOS             0x10498a974 0x104390000 + 6269300
11  Bronkhorst.FlowControl.iOS             0x104959ca4 0x104390000 + 6069412
12  Bronkhorst.FlowControl.iOS             0x1049069dc 0x104390000 + 5728732
13  Bronkhorst.FlowControl.iOS             0x1049096fc 0x104390000 + 5740284
14  Bronkhorst.FlowControl.iOS             0x104a45c2c 0x104390000 + 7035948
15  Bronkhorst.FlowControl.iOS             0x104a48988 0x104390000 + 7047560
16  UIKitCore                              0x182e6b5d0 -[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 336

This is easily fixable, by adding the following to the csproj, rebuilding and triggering a new crash report:

<PropertyGroup>
    <NoSymbolStrip>true</NoSymbolStrip>
</PropertyGroup>

Can you do that and upload the new crash report?

Joost-Jens-Luminis commented 1 month ago

Here is the file:

Bronkhorst.FlowControl.iOS-2024-09-17-084529.txt

rolfbjarne commented 1 month ago

That didn't change anything, the stack trace still only shows memory addresses.

Can you get an MSBuild binlog so that I can investigate to see why this happens?

Joost-Jens-Luminis commented 1 month ago

Maybe that was because it was a release build. I made one with debug, there is a bit more information in here. Crash report from debug build.txt

Unfortunately I spend yesterday trying to update my systems (Windows and Mac) so I can reproduce this while debugging, unfortunately that prevented me from debugging al to gather (VS can't connect to my Mac anymore due to a null reference exception), so this is still a build from the Azure Devops Pipeline but with debugging set instead of release. I hope this is enough.

rolfbjarne commented 1 month ago

Yes, that's a bit more helpful at least.

Can you try turning the interpreter off to see if that changes anything?

<PropertyGroup>
    <UseInterpreter>false</UseInterpreter>
</PropertyGroup>

if it still fails the same way, please get an updated crash report.

Joost-Jens-Luminis commented 1 month ago

Still doesn't work. So here is a new crash report.

Bronkhorst.FlowControl.iOS-2024-09-18-111358.txt

rolfbjarne commented 1 month ago

Still doesn't work. So here is a new crash report.

Bronkhorst.FlowControl.iOS-2024-09-18-111358.txt

Correct, but the stack trace reveals what the main thread is waiting for:

15  System_Runtime_CompilerServices_TaskAwaiter_GetResult + 120
16  MvvmCross_ViewModels_MvxAppStart_Start_object + 224
17  MvvmCross_Platforms_Ios_Core_MvxApplicationDelegate_RunAppStart_object + 456
18  MvvmCross_Platforms_Ios_Core_MvxApplicationDelegate_FinishedLaunching_UIKit_UIApplication_Foundation_NSDictionary + 524
19  Bronkhorst_FlowControl_iOS_AppDelegate_FinishedLaunching_UIKit_UIApplication_Foundation_NSDictionary + 1240

Your FinishedLaunching method calls MvvmCross' FinishedLaunchingMethod, which runs MvxAppliationDelegate.RunAppStart, which eventually does a "task.GetResult ()", which doesn't return, presumably because the task doesn't complete.

I'm guessing this is the GetResult call in MvvmCross:

https://github.com/MvvmCross/MvvmCross/blob/9f39b77f60255c57a0c542bbe36b3d2237980973/MvvmCross/ViewModels/MvxAppStart.cs#L30

and the problem is that the call to NavigateToFirstViewModel never completes:

https://github.com/MvvmCross/MvvmCross/blob/9f39b77f60255c57a0c542bbe36b3d2237980973/MvvmCross/ViewModels/MvxAppStart.cs#L45

That's as far as I can get from the crash report, to continue diagnosing I'd probably need a test project I can use to reproduce this myself.

Joost-Jens-Luminis commented 1 month ago

I have made a sample app. The problem is in that path but it is the following line:

StartAsync().GetAwaiter().GetResult();

What "StartAsync" does is less relevant it seems.

I have made a new iOS Application, the default App that Visual studio creates. image

The only change I made was in the AppDelegate.cs file. I modified the FinishedLaunching method as follows:

        public override bool FinishedLaunching(UIApplication application, NSDictionary launchOptions)
        {
            ...
            StartAsync().GetAwaiter().GetResult(); // THIS IS NEWLY ADDED

            return true;
        }

The StartAsync() method looks like this (Modelled after wat MvvmCross did here):

        private int _startHasCommenced;
        public async Task StartAsync(object? hint = null)
        {
            // Check whether Start has commenced, and return if it has
            if (Interlocked.CompareExchange(ref _startHasCommenced, 1, 0) == 1)
                return;

            int applicationHint = await Task.Run(async () =>
            {
                await Task.Delay(1000);
                return 0;
            });

            await Task.Run(async () =>
            {
                await Task.Delay(1000);
                return applicationHint;
            });
        }

This seems to be causing the issues. Crash files coming in a next post.

Joost-Jens-Luminis commented 1 month ago

This is the first crash file containing the code mentioned above. iOSApp1-2024-09-19-111807.txt

For sake of testing, I have made a test run with "Interlocked.CompareExchange" commented out to see if that was the issue. This is that log file: iOSApp1-2024-09-19-114321.txt

Edit: Just to clarify, if I don't call StartAsync().GetAwaiter().GetResult(); everything works and I see the Hello iOS message.

Joost-Jens-Luminis commented 1 month ago

iOSApp1.zip

And the project itself.

rolfbjarne commented 1 month ago

iOSApp1.zip

And the project itself.

I can reproduce a deadlock with that project, even with the earlier version of our workload where your main project works, so it's not showing the regression.

Joost-Jens-Luminis commented 1 month ago

oh? That is interesting. Yet the .Net version is the only variable that changed in that time. There not been an MvvmCross update in the mean time either.

And I also see the deadlock now without Appium on my test version (Sorry I didn't test that before hand). So why is that working normally just fine, but not in this test version? And why has that started causing issues normally when running Appium?

Could this be as a result of an SDK change between SDK 8.0.300 and SDK 8.0.400and and not related to Xamarin directly?

rolfbjarne commented 1 month ago

oh? That is interesting. Yet the .Net version is the only variable that changed in that time. There not been an MvvmCross update in the mean time either.

And I also see the deadlock now without Appium on my test version (Sorry I didn't test that before hand). So why is that working normally just fine, but not in this test version? And why has that started causing issues normally when running Appium?

Could this be as a result of an SDK change between SDK 8.0.300 and SDK 8.0.400and and not related to Xamarin directly?

It could be a result of many things, but it would just be speculation.

I can do a proper investigation with confirmed results if I could get a test project that reproduces the regression - i.e. it works on the older workload, but not on the newer workload.

Joost-Jens-Luminis commented 1 month ago

Hi @rolfbjarne I just wanted to give an update on this. As I had updated my computer and only have 1 mac I have been unable to get a working setup that shows regression here. What is strange is that the code which seems to be causing the deadlock just doesn't work in my test setup, but it does work in de main application.(assuming the WDA isn't running) in all instances. To me this indicates .Net does some really weird stuff when it comes to async functions and I guess I don't understand this.

I will keep trying to reproduce this using a VM with an older installation, of .Net, but this is a slow process.

rolfbjarne commented 1 month ago

I will keep trying to reproduce this using a VM with an older installation, of .Net, but this is a slow process.

Thanks!

It truly sounds like something weird is going on, but that just makes it so much more important for us to actually have a way to reproduce the problem, otherwise we'll just be fumbling in the dark.

Joost-Jens-Luminis commented 1 month ago

I have a small update on this. While attempting to recreate the original situation of our App working while the WebDriverAgent is working by having a Windows VM with only the .net 8.0.303 SDK installed, we noticed that we need the following workload for it to work. Newer workloads fail: dotnet workload restore --version 17.2.8078

I am still looking for the specific code that causes the deadlock so I can reproduce it.

Joost-Jens-Luminis commented 1 month ago

I have more information. We are running into an issue when using MvvmCross and the WebDriverAgent from Appium. All of a sudden ViewModels are not created anymore on iOS. This causes the App to deadlock during startup. I am trying to pinpoint the cause of the issue but the exception we are getting is as follows:

**System.NullReferenceException:** 'Object reference not set to an instance of an object'

   at MvvmCross.ViewModels.MvxViewModelLoader.LoadViewModel(MvxViewModelRequest request, IMvxBundle savedState, IMvxNavigateEventArgs navigationArgs)
   at MvvmCross.Platforms.Ios.Views.MvxViewControllerExtensions.LoadViewModel(IMvxIosView iosView)
   at MvvmCross.Views.MvxViewExtensions.OnViewCreate(IMvxView view, Func`1 viewModelLoader)
   at MvvmCross.Platforms.Ios.Views.MvxViewControllerExtensions.OnViewCreate(IMvxIosView iosView)
   at MvvmCross.Platforms.Ios.Views.MvxViewControllerAdapter.HandleViewDidLoadCalled(Object sender, EventArgs e)
   at MvvmCross.Base.MvxDelegateExtensions.Raise(EventHandler eventHandler, Object sender)
   at MvvmCross.Platforms.Ios.Views.Base.MvxEventSourceViewController.ViewDidLoad()
   at MvvmCross.Platforms.Ios.Views.MvxViewController.ViewDidLoad()
   at Bronkhorst.FlowControl.iOS.Views.DevicesView.ViewDidLoad() in C:\Work\Flowsuite\Bronkhorst.FlowControl.iOS\Views\DevicesView.cs:line 29

This was working before (on ios workload 17.2.8078) but stopped working. I Have asked MvvmCross developer as to why it's happening but they have no idea either at this time.

THe file it is failing is this: https://github.com/MvvmCross/MvvmCross/blob/develop/MvvmCross/Platforms/Ios/Views/MvxViewControllerExtensions.cs

Joost-Jens-Luminis commented 1 month ago

I think I have found the problem. On the old workload (Workload 17.2.8078) the DevicesView constructor gets called only once.

On the latest workload the DevicesView constructor gets called twice. This also means the ViewDidLoad function gets called twice. One time correctly (it seems) and the second time not.. I think somewhere the workload is trying to load something it shouldn't.

See attached the stack trace of the 3 constructor calls. Old workload is working properly, Call 1 of the new workload seems fine but Call 2 does not. Old workload Constuctor Call.txt New workload Constuctor Call 1.txt New workload Constuctor Call 2.txt

Could the new workload have caused this?

rolfbjarne commented 1 month ago

That's really interesting, but I have no idea what might be causing this.

Would you be able to create a test project I can use to reproduce this myself now?

Joost-Jens-Luminis commented 1 month ago

I think so yes, but that will probably by the end of the week. It's end of sprint tomorrow and that has priority :(

Joost-Jens-Luminis commented 1 month ago

I had more time then expected today, so please find here the sample app: Bronkhorst app crash sample.zip

It is probably a bit bigger then you would prefer, as it includes the MvvmCross core library and some plugins as well as a very stripped down version of our app. It does show the problem though so I hope it's enough.

In the "Bronkhorst" folder you would see the iOS app which you can compile and run to show the problem. If you would put a breakpoint in the DevicesView constructor (which can be found in the Views folder) you can see it's called/started twice and if you would let it run you will see the exceptions start appearing and the app is hanging.

We have also found what we believe to be the cause of the issue though: The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID). If we remove that property, it appears to fix the problem we are having. (This hasn't been tested in the full app yet though, only in this sample version).

rolfbjarne commented 1 month ago

Exactly which workload versions can you reproduce the failure with? What's the output of dotnet --info?

Joost-Jens-Luminis commented 1 month ago

You can use the latest workload to see it go wrong.

On iOS Workload 17.2.8078 everything is working as expected.

rolfbjarne commented 1 month ago

This works for me, which is the latest as of today:

$ dotnet workload list

Installed Workload Id      Manifest Version       Installation Source
---------------------------------------------------------------------
maui-tizen                 8.0.82/8.0.100         SDK 8.0.300
ios                        18.0.8303/8.0.100      SDK 8.0.300
Joost-Jens-Luminis commented 1 month ago

That is interesting. With this it was failing: (This is what I have on my PC) image

But you have a newer workload so maybe it's fixed with that?

rolfbjarne commented 1 month ago

Even those versions work for me :/

A few more questions:

Joost-Jens-Luminis commented 1 month ago

1) I am only working on a real device 2) Yes 3) I haven't tested that yet. I have always used VS on windows to debug versions. I have used release builds as well.

Do you have the WDA from Appium installed on your test device? Without that installed it works for me as well, but with that installed is when these problems occur.

rolfbjarne commented 1 month ago

Do you have the WDA from Appium installed on your test device? Without that installed it works for me as well, but with that installed is when these problems occur.

No, I don't; how do I get that installed?

Joost-Jens-Luminis commented 1 month ago

The following are the instructions I follow when I installed it (They are from our own internal wiki).

A good tutorial can be found here: https://www.youtube.com/watch?v=kvNnF4Zpgmo&t=2s Note: it is a bit out of date as you no longer need to install the WDA from the commandline.

Source: https://www.33rdsquare.com/the-complete-step-by-step-guide-to-appium-ios-test-automation/

https://appium.github.io/appium-xcuitest-driver/latest/

You can open the WDA using the following command:

appium driver run xcuitest open-wda

Make sure to set codesign for all packages (else it may not build).

rolfbjarne commented 4 weeks ago

OK, I can reproduce it now, the WDA app must also be running from inside Xcode in order to reproduce.

rolfbjarne commented 3 weeks ago

I've finally figured out what's happening, and it's an unintended side effect of a new feature we added.

The problem is in with the Window property in the class MvxApplicationDelegate, and how it's declared new:

public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
    public new virtual UIWindow Window { get; set; }
    // ...
}

Prior to when this broke, this Window property was not exported to Objective-C (it doesn't have an [Export] attribute, and doesn't override a property that does either).

In one of our latest updates, we implemented improved support for binding protocols by taking advantage of default interface members in C#. The consequence is that the Window property is now exported to Objective-C, because:

  1. The Objective-C protocol UIApplicationDelegate has a Window property.
  2. This Window property is optional, so it's technically not necessary to implement it.
  3. The MvxApplicationDelegate class implements the IMvxApplicationDelegate interface, which implements the IUIApplicationDelegate interface (which represents the Objective-C protocol UIApplicationDelegate).
  4. We see that MvxApplicationDelegate's Window property matches the Window property on the IUIApplicationDelegate interface (which has an [Export] attribute), and export it to Objective-C.
  5. The presence in Objective-C of the Window property makes iOS take a very different code path internally, and that eventually ends up instantiating the DevicesView class twice (the storyboard where the DevicesView class is used is instantiated twice).

Sidenote: it's possible to get replicate the broken behavior with the older version of the workload by making the Window property override the Window property from UIApplicationDelegate (in this case the property is exported to Objective-C, because UIApplication.Window has an [Export] attribute):

public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
    public override UIWindow Window { get; set; }
    // ...
}

Going forward, I don't think this is something we'll fix/change on our side, because it's such an incredible rare set of circumstances that leads to the eventual deadlock at startup.

I believe the best fix is on your end, by making the DevicesView class not fail when it's instantiated more than once.

Alternatively it might be possible to just rename the Window property, to say MainWindow:

public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
    public UIWindow MainWindow { get; set; }
    /* Update all the Window references in this class to use MainWindow instead as well! */
}

I'll close this now, but feel free to comment if you disagree or have any further problems, and I'll see what I can do.

FWIW this was one of the most interesting issues I've had to diagnose in a while, so thanks a lot for taking the time to create a test project I could use to reproduce it!

Joost-Jens-Luminis commented 3 weeks ago

Wow, thanks for the explanation. I am happy you have figured out what went wrong (and that you enjoyed it :D ).

One final question though, is our work around fine? Or will that cause issues in the future?

The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID).
 If we remove that property, it appears to fix the problem we are having. 

edit: spelling

rolfbjarne commented 3 weeks ago

Wow, thanks for the explanation. I am happy you have figured out what went wrong (and that you enjoyed it :D ).

One final question though, is our work around fine? Or will that cause issues in the future?

The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID).
 If we remove that property, it appears to fix the problem we are having. 

edit: spelling

Yes, that should work fine too I believe.

Joost-Jens-Luminis commented 3 weeks ago

In that case I consider this issue resolved, the rest lies with the MvvmCross people whom I have informed. Thank you so much for your hard work on this, it's much appreciated.