Closed rolfbjarne closed 3 weeks ago
From @vitek-karas on Fri, 13 Sep 2024 11:29:09 GMT
@rolfbjarne maybe you have some context. Should we move this to xamarin-macios?
@Joost-Jens-Luminis it's not clear from your description, but can you reproduce this locally, or does it only happen on CI?
To start diagnosing this, it would be good to get any crash reports from the device: https://github.com/xamarin/xamarin-macios/wiki/Diagnosis#crash-reports
That would reveal what the app is actually doing when it's being terminated.
@rolfbjarne Yes, this can be reproduced locally. A colleague of mine has updated his installation and with his builds (both debug and release builds) the takes to long to start. To be extra clear, the splashscreen becomes visible, but then the app stops responsing and gets closed by the watchdog on iOS after about 19 seconds.
This also means that there is no crash. It just gets closed.
This is a video of what happens. It's boring, but give you an idea https://github.com/user-attachments/assets/98f03e76-f922-4fcb-b64f-078fc3c93134
The first document is the console log from the iOS device regarding what is going on at the time for our App (Bronkhorst FlowcontrolApp). The second one is the full log unfiltered (in cases the filter filters something out that is important) Filtered console log.txt Full console log.txt
I hope this provides enough information to get a clue as to what is going on here.
This also means that there is no crash. It just gets closed.
iOS will still create a crash report. This is from the device log:
ReportCrash Formulating fatal 309 report for corpse[379] Bronkhorst.FlowControl.iOS
You are correct, this is a crash log I got from the phone. I generated it this morning though as I couldn't find the crashlogs from last Friday. But I assume it's the same crash.
Yes, that crash report is for the watchdog:
exhausted real (wall clock) time allowance of 19.76 seconds
Unfortunately all the important frames in the stack trace are only memory addresses:
Thread 0 name: tid_103 Dispatch queue: com.apple.main-thread
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x1bb181484 __psynch_cvwait + 8
1 libsystem_pthread.dylib 0x1dbc09860 _pthread_cond_wait$VARIANT$armv81 + 1224
2 Bronkhorst.FlowControl.iOS 0x104851968 0x104390000 + 4987240
3 Bronkhorst.FlowControl.iOS 0x10491bb5c 0x104390000 + 5815132
4 Bronkhorst.FlowControl.iOS 0x10491ba00 0x104390000 + 5814784
5 Bronkhorst.FlowControl.iOS 0x10493bc90 0x104390000 + 5946512
6 Bronkhorst.FlowControl.iOS 0x1048da590 0x104390000 + 5547408
7 Bronkhorst.FlowControl.iOS 0x104999a6c 0x104390000 + 6330988
8 Bronkhorst.FlowControl.iOS 0x104998404 0x104390000 + 6325252
9 Bronkhorst.FlowControl.iOS 0x10498cddc 0x104390000 + 6278620
10 Bronkhorst.FlowControl.iOS 0x10498a974 0x104390000 + 6269300
11 Bronkhorst.FlowControl.iOS 0x104959ca4 0x104390000 + 6069412
12 Bronkhorst.FlowControl.iOS 0x1049069dc 0x104390000 + 5728732
13 Bronkhorst.FlowControl.iOS 0x1049096fc 0x104390000 + 5740284
14 Bronkhorst.FlowControl.iOS 0x104a45c2c 0x104390000 + 7035948
15 Bronkhorst.FlowControl.iOS 0x104a48988 0x104390000 + 7047560
16 UIKitCore 0x182e6b5d0 -[UIApplication _handleDelegateCallbacksWithOptions:isSuspended:restoreState:] + 336
This is easily fixable, by adding the following to the csproj, rebuilding and triggering a new crash report:
<PropertyGroup>
<NoSymbolStrip>true</NoSymbolStrip>
</PropertyGroup>
Can you do that and upload the new crash report?
Here is the file:
That didn't change anything, the stack trace still only shows memory addresses.
Can you get an MSBuild binlog so that I can investigate to see why this happens?
Maybe that was because it was a release build. I made one with debug, there is a bit more information in here. Crash report from debug build.txt
Unfortunately I spend yesterday trying to update my systems (Windows and Mac) so I can reproduce this while debugging, unfortunately that prevented me from debugging al to gather (VS can't connect to my Mac anymore due to a null reference exception), so this is still a build from the Azure Devops Pipeline but with debugging set instead of release. I hope this is enough.
Yes, that's a bit more helpful at least.
Can you try turning the interpreter off to see if that changes anything?
<PropertyGroup>
<UseInterpreter>false</UseInterpreter>
</PropertyGroup>
if it still fails the same way, please get an updated crash report.
Still doesn't work. So here is a new crash report.
Still doesn't work. So here is a new crash report.
Correct, but the stack trace reveals what the main thread is waiting for:
15 System_Runtime_CompilerServices_TaskAwaiter_GetResult + 120
16 MvvmCross_ViewModels_MvxAppStart_Start_object + 224
17 MvvmCross_Platforms_Ios_Core_MvxApplicationDelegate_RunAppStart_object + 456
18 MvvmCross_Platforms_Ios_Core_MvxApplicationDelegate_FinishedLaunching_UIKit_UIApplication_Foundation_NSDictionary + 524
19 Bronkhorst_FlowControl_iOS_AppDelegate_FinishedLaunching_UIKit_UIApplication_Foundation_NSDictionary + 1240
Your FinishedLaunching method calls MvvmCross' FinishedLaunchingMethod, which runs MvxAppliationDelegate.RunAppStart, which eventually does a "task.GetResult ()", which doesn't return, presumably because the task doesn't complete.
I'm guessing this is the GetResult call in MvvmCross:
and the problem is that the call to NavigateToFirstViewModel never completes:
That's as far as I can get from the crash report, to continue diagnosing I'd probably need a test project I can use to reproduce this myself.
I have made a sample app. The problem is in that path but it is the following line:
StartAsync().GetAwaiter().GetResult();
What "StartAsync" does is less relevant it seems.
I have made a new iOS Application, the default App that Visual studio creates.
The only change I made was in the AppDelegate.cs file. I modified the FinishedLaunching method as follows:
public override bool FinishedLaunching(UIApplication application, NSDictionary launchOptions)
{
...
StartAsync().GetAwaiter().GetResult(); // THIS IS NEWLY ADDED
return true;
}
The StartAsync() method looks like this (Modelled after wat MvvmCross did here):
private int _startHasCommenced;
public async Task StartAsync(object? hint = null)
{
// Check whether Start has commenced, and return if it has
if (Interlocked.CompareExchange(ref _startHasCommenced, 1, 0) == 1)
return;
int applicationHint = await Task.Run(async () =>
{
await Task.Delay(1000);
return 0;
});
await Task.Run(async () =>
{
await Task.Delay(1000);
return applicationHint;
});
}
This seems to be causing the issues. Crash files coming in a next post.
This is the first crash file containing the code mentioned above. iOSApp1-2024-09-19-111807.txt
For sake of testing, I have made a test run with "Interlocked.CompareExchange" commented out to see if that was the issue. This is that log file: iOSApp1-2024-09-19-114321.txt
Edit: Just to clarify, if I don't call StartAsync().GetAwaiter().GetResult(); everything works and I see the Hello iOS message.
And the project itself.
And the project itself.
I can reproduce a deadlock with that project, even with the earlier version of our workload where your main project works, so it's not showing the regression.
oh? That is interesting. Yet the .Net version is the only variable that changed in that time. There not been an MvvmCross update in the mean time either.
And I also see the deadlock now without Appium on my test version (Sorry I didn't test that before hand). So why is that working normally just fine, but not in this test version? And why has that started causing issues normally when running Appium?
Could this be as a result of an SDK change between SDK 8.0.300 and SDK 8.0.400and and not related to Xamarin directly?
oh? That is interesting. Yet the .Net version is the only variable that changed in that time. There not been an MvvmCross update in the mean time either.
And I also see the deadlock now without Appium on my test version (Sorry I didn't test that before hand). So why is that working normally just fine, but not in this test version? And why has that started causing issues normally when running Appium?
Could this be as a result of an SDK change between SDK 8.0.300 and SDK 8.0.400and and not related to Xamarin directly?
It could be a result of many things, but it would just be speculation.
I can do a proper investigation with confirmed results if I could get a test project that reproduces the regression - i.e. it works on the older workload, but not on the newer workload.
Hi @rolfbjarne I just wanted to give an update on this. As I had updated my computer and only have 1 mac I have been unable to get a working setup that shows regression here. What is strange is that the code which seems to be causing the deadlock just doesn't work in my test setup, but it does work in de main application.(assuming the WDA isn't running) in all instances. To me this indicates .Net does some really weird stuff when it comes to async functions and I guess I don't understand this.
I will keep trying to reproduce this using a VM with an older installation, of .Net, but this is a slow process.
I will keep trying to reproduce this using a VM with an older installation, of .Net, but this is a slow process.
Thanks!
It truly sounds like something weird is going on, but that just makes it so much more important for us to actually have a way to reproduce the problem, otherwise we'll just be fumbling in the dark.
I have a small update on this. While attempting to recreate the original situation of our App working while the WebDriverAgent is working by having a Windows VM with only the .net 8.0.303 SDK installed, we noticed that we need the following workload for it to work. Newer workloads fail:
dotnet workload restore --version 17.2.8078
I am still looking for the specific code that causes the deadlock so I can reproduce it.
I have more information. We are running into an issue when using MvvmCross and the WebDriverAgent from Appium. All of a sudden ViewModels are not created anymore on iOS. This causes the App to deadlock during startup. I am trying to pinpoint the cause of the issue but the exception we are getting is as follows:
**System.NullReferenceException:** 'Object reference not set to an instance of an object'
at MvvmCross.ViewModels.MvxViewModelLoader.LoadViewModel(MvxViewModelRequest request, IMvxBundle savedState, IMvxNavigateEventArgs navigationArgs)
at MvvmCross.Platforms.Ios.Views.MvxViewControllerExtensions.LoadViewModel(IMvxIosView iosView)
at MvvmCross.Views.MvxViewExtensions.OnViewCreate(IMvxView view, Func`1 viewModelLoader)
at MvvmCross.Platforms.Ios.Views.MvxViewControllerExtensions.OnViewCreate(IMvxIosView iosView)
at MvvmCross.Platforms.Ios.Views.MvxViewControllerAdapter.HandleViewDidLoadCalled(Object sender, EventArgs e)
at MvvmCross.Base.MvxDelegateExtensions.Raise(EventHandler eventHandler, Object sender)
at MvvmCross.Platforms.Ios.Views.Base.MvxEventSourceViewController.ViewDidLoad()
at MvvmCross.Platforms.Ios.Views.MvxViewController.ViewDidLoad()
at Bronkhorst.FlowControl.iOS.Views.DevicesView.ViewDidLoad() in C:\Work\Flowsuite\Bronkhorst.FlowControl.iOS\Views\DevicesView.cs:line 29
This was working before (on ios workload 17.2.8078) but stopped working. I Have asked MvvmCross developer as to why it's happening but they have no idea either at this time.
THe file it is failing is this: https://github.com/MvvmCross/MvvmCross/blob/develop/MvvmCross/Platforms/Ios/Views/MvxViewControllerExtensions.cs
I think I have found the problem. On the old workload (Workload 17.2.8078) the DevicesView constructor gets called only once.
On the latest workload the DevicesView constructor gets called twice. This also means the ViewDidLoad function gets called twice. One time correctly (it seems) and the second time not.. I think somewhere the workload is trying to load something it shouldn't.
See attached the stack trace of the 3 constructor calls. Old workload is working properly, Call 1 of the new workload seems fine but Call 2 does not. Old workload Constuctor Call.txt New workload Constuctor Call 1.txt New workload Constuctor Call 2.txt
Could the new workload have caused this?
That's really interesting, but I have no idea what might be causing this.
Would you be able to create a test project I can use to reproduce this myself now?
I think so yes, but that will probably by the end of the week. It's end of sprint tomorrow and that has priority :(
I had more time then expected today, so please find here the sample app: Bronkhorst app crash sample.zip
It is probably a bit bigger then you would prefer, as it includes the MvvmCross core library and some plugins as well as a very stripped down version of our app. It does show the problem though so I hope it's enough.
In the "Bronkhorst" folder you would see the iOS app which you can compile and run to show the problem. If you would put a breakpoint in the DevicesView constructor (which can be found in the Views folder) you can see it's called/started twice and if you would let it run you will see the exceptions start appearing and the app is hanging.
We have also found what we believe to be the cause of the issue though: The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID). If we remove that property, it appears to fix the problem we are having. (This hasn't been tested in the full app yet though, only in this sample version).
Exactly which workload versions can you reproduce the failure with? What's the output of dotnet --info
?
You can use the latest workload to see it go wrong.
On iOS Workload 17.2.8078 everything is working as expected.
This works for me, which is the latest as of today:
$ dotnet workload list
Installed Workload Id Manifest Version Installation Source
---------------------------------------------------------------------
maui-tizen 8.0.82/8.0.100 SDK 8.0.300
ios 18.0.8303/8.0.100 SDK 8.0.300
That is interesting. With this it was failing: (This is what I have on my PC)
But you have a newer workload so maybe it's fixed with that?
Even those versions work for me :/
A few more questions:
1) I am only working on a real device 2) Yes 3) I haven't tested that yet. I have always used VS on windows to debug versions. I have used release builds as well.
Do you have the WDA from Appium installed on your test device? Without that installed it works for me as well, but with that installed is when these problems occur.
Do you have the WDA from Appium installed on your test device? Without that installed it works for me as well, but with that installed is when these problems occur.
No, I don't; how do I get that installed?
The following are the instructions I follow when I installed it (They are from our own internal wiki).
A good tutorial can be found here: https://www.youtube.com/watch?v=kvNnF4Zpgmo&t=2s Note: it is a bit out of date as you no longer need to install the WDA from the commandline.
Source: https://www.33rdsquare.com/the-complete-step-by-step-guide-to-appium-ios-test-automation/
Homebrew is a must-have package manager for any Mac device. Open your Mac terminal and install Homebrew:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Next, use Homebrew to install Node.js and the Appium server:
brew install node
npm install -g appium
Appium requires some additional iOS-focused dependencies like ios-deploy and Carthage:
brew install ideviceinstaller
npm install -g ios-deploy
brew install carthage
Install the Appium driver:
appium driver install xcuitest
https://appium.github.io/appium-xcuitest-driver/latest/
You can open the WDA using the following command:
appium driver run xcuitest open-wda
Make sure to set codesign for all packages (else it may not build).
OK, I can reproduce it now, the WDA app must also be running from inside Xcode in order to reproduce.
I've finally figured out what's happening, and it's an unintended side effect of a new feature we added.
The problem is in with the Window
property in the class MvxApplicationDelegate
, and how it's declared new
:
public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
public new virtual UIWindow Window { get; set; }
// ...
}
Prior to when this broke, this Window
property was not exported to Objective-C (it doesn't have an [Export]
attribute, and doesn't override a property that does either).
In one of our latest updates, we implemented improved support for binding protocols by taking advantage of default interface members in C#. The consequence is that the Window
property is now exported to Objective-C, because:
UIApplicationDelegate
has a Window
property.Window
property is optional, so it's technically not necessary to implement it.MvxApplicationDelegate
class implements the IMvxApplicationDelegate
interface, which implements the IUIApplicationDelegate
interface (which represents the Objective-C protocol UIApplicationDelegate
).MvxApplicationDelegate
's Window
property matches the Window
property on the IUIApplicationDelegate
interface (which has an [Export]
attribute), and export it to Objective-C.Window
property makes iOS take a very different code path internally, and that eventually ends up instantiating the DevicesView
class twice (the storyboard where the DevicesView
class is used is instantiated twice).Sidenote: it's possible to get replicate the broken behavior with the older version of the workload by making the Window
property override the Window
property from UIApplicationDelegate
(in this case the property is exported to Objective-C, because UIApplication.Window
has an [Export]
attribute):
public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
public override UIWindow Window { get; set; }
// ...
}
Going forward, I don't think this is something we'll fix/change on our side, because it's such an incredible rare set of circumstances that leads to the eventual deadlock at startup.
I believe the best fix is on your end, by making the DevicesView
class not fail when it's instantiated more than once.
Alternatively it might be possible to just rename the Window
property, to say MainWindow
:
public abstract class MvxApplicationDelegate : UIApplicationDelegate, IMvxApplicationDelegate
{
public UIWindow MainWindow { get; set; }
/* Update all the Window references in this class to use MainWindow instead as well! */
}
I'll close this now, but feel free to comment if you disagree or have any further problems, and I'll see what I can do.
FWIW this was one of the most interesting issues I've had to diagnose in a while, so thanks a lot for taking the time to create a test project I could use to reproduce it!
Wow, thanks for the explanation. I am happy you have figured out what went wrong (and that you enjoyed it :D ).
One final question though, is our work around fine? Or will that cause issues in the future?
The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID).
If we remove that property, it appears to fix the problem we are having.
edit: spelling
Wow, thanks for the explanation. I am happy you have figured out what went wrong (and that you enjoyed it :D ).
One final question though, is our work around fine? Or will that cause issues in the future?
The "Main.storyboard" file has a property called "initialViewController" which is set to the DevicesView (via it's storyboard ID). If we remove that property, it appears to fix the problem we are having.
edit: spelling
Yes, that should work fine too I believe.
In that case I consider this issue resolved, the rest lies with the MvvmCross people whom I have informed. Thank you so much for your hard work on this, it's much appreciated.
From @Joost-Jens-Luminis on Fri, 13 Sep 2024 07:26:42 GMT
We have been creating tests using Appium for a few months now for our app on Windows, Android and iOS. Unfortunately, with the release of .Net 8.0.401 and the released iOS workload it broke our ability to test on iOS.
When the WebDriverAgent from Appium is running, it takes to long for the App to start now and it gets killed by the iOS watchdog. If the WebDriverAgent from Appium is not running, the app is starting up fine and working properly.
With .Net version 8.0.300 (and related workloads) it is still working properly. Unfortunately, as we are using a release version, build in Azure Pipelines, to run the tests we cannot downgrade to an earlier version of the workloads.
The following installation is still working fine with Appium testing on iOS (This is from my PC):
Is there any information on what has changed in net8.0-ios that could have caused this? And how we could work around this issue?
Note: We are a 100% sure Appium didn't cause this, as we hadn't update Appium when the problems started. Of course we did try updating Appium and it's drivers to the latest version but to no avail.
Copied from original issue dotnet/runtime#107781