Closed tmijieux closed 2 years ago
I think I found a workaround for the point where I was stuck with xabuild. For a reason I do not understand yet, NuGetTargets has its value set as if VisualStudioVersion were 15.0 and the import is skipped, although VisualStudioVersion appears to be 17.0 everywhere I look in my binlog.
(This file comes from Visual Studio and is imported through the Current symlink.)
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<PropertyGroup>
<NuGetTargets Condition="'$(NuGetTargets)'==''">$(MSBuildExtensionsPath)\Microsoft\NuGet\$(VisualStudioVersion)\Microsoft.NuGet.targets</NuGetTargets>
</PropertyGroup>
<Import Condition="Exists('$(NuGetTargets)') and '$(SkipImportNuGetBuildTargets)' != 'true'" Project="$(NuGetTargets)" />
</Project>
If I set NuGetTargets to the right value before any import, the PackageReference now works correctly.
I managed to get the application to crash in the debugger, but since I have not yet compiled my own mono, I do not have the source or sufficiently detailed debugging information in the shared objects. I cannot print any values or move up or down the stack yet 😢. We may still have a little more information than in the previous stack dump, though...
-exec bt
#0 0x00007c72ab4f02a8 in syscall () from C:\Users\Thomas\AppData\Local\Temp\x64\Debug\.gdb\libc.so
#1 0x00007c72ab4f3213 in abort () from C:\Users\Thomas\AppData\Local\Temp\x64\Debug\.gdb\libc.so
#2 0x00007c6fbf23174a in ?? ()
#3 0x00007c6fbf1d9cf0 in ?? ()
#4 0x00007c6fbfb0f154 in eglib_log_adapter (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, message=0x7c70c92a0e90 "* Assertion: should not be reached at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-scan-object.h:91\n", user_data=0x0) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/utils/mono-logger.c:405
#5 0x00007c6fbfb3c72a in monoeg_g_logstr (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, msg=0x7c70c92a0e90 "* Assertion: should not be reached at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-scan-object.h:91\n") at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/eglib/goutput.c:151
#6 0x00007c6fbfb3c00e in monoeg_g_logv_nofree (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0x7c6fbfb96dca "* Assertion: should not be reached at %s:%d\n", args=0x7c6fbf1d9020) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/eglib/goutput.c:166
#7 0x00007c6fbfb3c2f4 in monoeg_assertion_message (format=0x7c6fbfb96dca "* Assertion: should not be reached at %s:%d\n") at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/eglib/goutput.c:201
#8 0x00007c6fbfb3c394 in mono_assertion_message_unreachable (file=0x7c6fbfb8bb6a "/Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-scan-object.h", line=91) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/eglib/goutput.c:228
#9 0x00007c6fbfaa0c77 in major_scan_object_concurrent_with_evacuation (full_object=0x7c6fb7a157e0, desc=35027181184716800, queue=0x7c72aac6b010) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-scan-object.h:91
#10 0x00007c6fbfac3dd1 in scan_card_table_for_block (block=0x7c6fb7a14000, scan_type=CARDTABLE_SCAN_MOD_UNION_PRECLEAN, ctx=...) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-marksweep.c:2619
#11 0x00007c6fbfa91e24 in major_scan_card_table (scan_type=CARDTABLE_SCAN_MOD_UNION_PRECLEAN, ctx=..., job_index=0, job_split_count=1, block_count=4973) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-marksweep.c:2711
#12 0x00007c6fbfa877d2 in job_major_mod_union_preclean (worker_data_untyped=0x7c72aac6b008, job=0x7c7001dc3208) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-gc.c:1554
#13 0x00007c6fbfb00947 in thread_func (data=0x0) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-thread-pool.c:207
#14 0x00007c72ab55dd2b in __pthread_start(void*) () from C:\Users\Thomas\AppData\Local\Temp\x64\Debug\.gdb\libc.so
#15 0x00007c72ab4f50c8 in __start_thread () from C:\Users\Thomas\AppData\Local\Temp\x64\Debug\.gdb\libc.so
-exec up
#14 0x00007c72ab55dd2b in __pthread_start(void*) () from C:\Users\Thomas\AppData\Local\Temp\x64\Debug\.gdb\libc.so
=thread-selected,id="22",frame={level="14",addr="0x00007c72ab55dd2b",func="__pthread_start(void*)",args=[],from="C:\\Users\\Thomas\\AppData\\Local\\Temp\\x64\\Debug\\.gdb\\libc.so",arch="i386:x86-64"}
-exec down
#12 0x00007c6fbfa877d2 in job_major_mod_union_preclean (worker_data_untyped=0x7c72aac6b008, job=0x7c7001dc3208) at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-gc.c:1554
1554 in /Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-gc.c
=thread-selected,id="22",frame={level="12",addr="0x00007c6fbfa877d2",func="job_major_mod_union_preclean",args=[{name="worker_data_untyped",value="0x7c72aac6b008"},{name="job",value="0x7c7001dc3208"}],file="/Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-gc.c",fullname="/Users/builder/jenkins/workspace/archive-mono/2020-02/android/debug/mono/sgen/sgen-gc.c",line="1554",arch="i386:x86-64"}
(limited to frame 12 and 14)
Using the "new" bridge implementation seems to be a valid workaround for now.
I am still worried that some of the code is corrupting memory and that changing the implementation only prevents the crashes through arbitrary implementation luck. But if all features of my app seem to work correctly and it is not crashing anymore, then it is still better than nothing :sweat_smile:. For the wellbeing of my app, I wish that there is a bug in the "tarjan" bridge implementation and that changing it to "new" definitely fixes my bug, but if that is the case then it is probably not good news for you ... :fearful:
I managed to get more information.
I put the mono/sgen sources in C:\Users\builder\jenkins\workspace\archive-mono\2020-02\android\debug\mono\sgen and added most of the libraries from C:/src/xamarin-android/bin/Debug/lib/xamarin.android/xbuild/Xamarin/Android/lib/x86_64/ to Visual Studio's 'Additional Symbol Search Paths', renaming the .so files to match the ones in the APK. This allowed me to get the following information:
-exec p *full_object->vtable
$7 = {klass = 0x1a00, gc_descr = 32342887324176384, domain = 0x72e66551eda000, type = 0x72e83fc77d0000, interface_bitmap = 0x20100000000cd00 <error: Cannot access memory at address 0x20100000000cd00>, max_interface_id = 0, rank = 0 '\000', initialized = 0 '\000', flags = 0 '\000', remote = 0, init_failed = 0, has_static_fields = 0, gc_bits = 0, imt_collisions_bitmap = 0, runtime_generic_context = 0x0, interp_vtable = 0x41da500000, vtable = 0x72e83fc7e42f}
It seems the three lower bits are consistently set to zero during the few times I was able to reproduce. Maybe the values can give people familiar with the GC implementation ideas about what could be causing this. Also, many of the variables that are most probably supposed to be pointers do not seem to point to valid memory. So maybe this is not a regular GC object that the GC is currently looking at...
~~Unless someone wants to get to the bottom of this and needs my help to reproduce it or to get more info (I will gladly help), or unless the bug reappears, I will probably not update this issue anymore. The fix still needs to go through quality checks and tests on many more devices, so this solution is not yet completely accepted on my side, but if it is, I will close this issue.~~
EDIT: I am rather eager to understand what is going on, to be sure this bug will not show up again uninvited,
so I read a little bit of code and documentation about sgen. I saw that the 3 lower bits of the vtable address are actually flags for the cemented/pinned/forwarded state in the GC (and in my screenshot that is the case ...), and I also saw some bits of code like this in the source:
/* We untag the vtable for concurrent M&S, in case bridge is running and it tagged it */
desc = sgen_vtable_get_descriptor ((GCVTable)SGEN_POINTER_UNTAG_VTABLE (vtable_word));
So I am wondering whether there could be a rare race condition with the tarjan implementation that re-tags the object after this line of code, so that the scan then works with a tagged object where it does not expect one. (In that case the memory the GC is looking at would just be shifted by 7 bytes, which could explain the issue...)
As I suspected, if I just clear the 3 lower bits of the vtable address in the debugger, I get valid objects every time (but never the same type of object).
I have made substantial progress in identifying and reproducing the issue, which I reported here:
The bug affects xamarin-android in its default configuration, but it is specific to mono sgen and it is also rather a corner case, most probably an unlikely race condition, so I don't know whether any action needs to be taken here (other than continuing to integrate upstream mono bugfixes when they are released). Changing the default bridge implementation is probably a bad idea, because for me it caused noticeable performance degradation on some Android devices and I had to tweak some other GC parameters to get it to an acceptable level of performance.
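To make the effect of a stray tag concrete, here is a minimal sketch in C. It is not mono code: `TAG_MASK`, `tag_word`, `untag_word` and `read_first_field` are hypothetical names, and the only assumption carried over from the discussion above is that the three low bits of an 8-byte-aligned vtable word are used as flags. It shows that following the word without masking reads memory up to 7 bytes past the real start, while masking the low bits restores the real address. It does not model the suspected race itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mask for the three low flag bits of an aligned vtable word. */
#define TAG_MASK ((uintptr_t)0x7)

/* Set flag bits in the low bits of an aligned pointer-sized word. */
static uintptr_t tag_word(uintptr_t word, unsigned flags)
{
    return word | ((uintptr_t)flags & TAG_MASK);
}

/* Clear the flag bits again, recovering the real address. */
static uintptr_t untag_word(uintptr_t word)
{
    return word & ~TAG_MASK;
}

/*
 * Read the first 8-byte field found at the address stored in `word`.
 * memcpy is used so that a tagged (unaligned) address can be read
 * without an unaligned dereference.
 */
static uint64_t read_first_field(uintptr_t word)
{
    uint64_t value;
    memcpy(&value, (const unsigned char *)word, sizeof value);
    return value;
}
```

With a 16-byte stand-in "vtable" whose first field is `0xaa`, `read_first_field(untag_word(w))` returns `0xaa`, whereas `read_first_field(w)` on a word tagged with all three bits returns bytes that straddle the first and second fields.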
@tmijieux thanks for a very thorough job investigating the issue! However, Xamarin.Android is but a "client" of the Mono runtime, so I'll pass the buck to @lambdageek who will hopefully be able to address and fix the issue, thanks again!
There's not much for me to do, other than collect all the backports and shepherd them in: @tmijieux did a great job investigating and fixing the underlying issue.
mono main: https://github.com/mono/mono/pull/21384
mono 2020-02: https://github.com/mono/mono/pull/21391
dotnet/runtime main: https://github.com/dotnet/runtime/pull/63293
dotnet/runtime release/6.0-maui: https://github.com/dotnet/runtime/pull/63296
@jonpryor @grendello The runtime fixes are in mono 2020-02 and dotnet release/6.0-maui. For !NET6 bump to mono/mono@a5d1934898bfdf06662cee5799782b09ce8afe5a
Thanks again @tmijieux !
Glad to see this fixed so quickly 🙂 any idea when we'll see a release with this fix included (currently a pretty large crasher for us)?
Does anybody know a Xamarin.Android version that works with .NET 5? I can reproduce on Pixel 4 (5G) - Android 12. The workaround with "MONO_GC_PARAMS=bridge-implementation=new,mode=throughput" did not help.
@FelixZY I am building the app with v. 6.12.0.164 but it does not fix this: https://stackoverflow.com/questions/70223786/is-this-sigabrt-crash-in-android-app-caused-by-xamarin-log-handler We probably need a new issue. Do we need to build with a different Xamarin.Android version too?
Android application type
Classic Xamarin.Android (MonoAndroid11.0, MonoAndroid12.0, etc.)
Affected platform version
VS2022 17.0.1, 17.0.2
VS2019 (latest, probably https://docs.microsoft.com/fr-fr/visualstudio/releases/2019/release-notes#16.11.5 but I have uninstalled it since)
Description
I am currently investigating a native crash very similar to https://github.com/xamarin/xamarin-android/issues/3892
The relevant part of the log seems to be this message:
Assertion: should not be reached at /Users/builder/jenkins/workspace/archive-mono/2020-02/android/release/mono/sgen/sgen-scan-object.h:91
(see here; this is actually a header file that contains preprocessor-templated code and is included at multiple places in the code). It seems to indicate that a portion of theoretically unreachable code was reached in the code that scans for references in the sgen garbage collector. Specifically, it looked at a field in a native struct to determine what type of object it is currently looking at, but the switch did not match any case, which seems to indicate that what it is looking at is not what it expected (could this be memory corruption?).
My intuition for now is that it has something to do with some C# code we introduced (though I am not 100% sure about that, and my rational self rather believes it is unlikely, since it looks like a native crash). It started crashing in a production release (we never reproduced it during development until it happened in production). In the new release we did update some NuGet packages, but even after reverting the package versions the crash was still there. However, when we rebuilt an old version that we knew did not trigger this crash initially (with the same Visual Studio/xamarin-android version that was producing the crashing builds), the new build of that old version still did not crash.
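The "switch did not match any case" failure mode can be sketched in C as follows. This is an illustration, not the sgen source: the enum names and values, `DESC_TYPE_MASK`, and `scan_kind` are hypothetical, and it assumes the object kind is encoded in the low three bits of the GC descriptor. The descriptor value used in the usage note is the `desc` argument from frame #9 of the backtrace in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor kinds; the real sgen values may differ. */
enum desc_kind {
    DESC_KIND_RUN_LENGTH = 1,
    DESC_KIND_BITMAP     = 2,
    DESC_KIND_VECTOR     = 3
};

/* Assumption: the kind lives in the low three bits of the descriptor. */
#define DESC_TYPE_MASK ((uint64_t)0x7)

/*
 * Return a label for the scan routine that would handle `desc`,
 * or NULL when no case matches. A scanner that treats the default
 * branch as unreachable would abort there instead.
 */
static const char *scan_kind(uint64_t desc)
{
    switch (desc & DESC_TYPE_MASK) {
    case DESC_KIND_RUN_LENGTH: return "run-length";
    case DESC_KIND_BITMAP:     return "bitmap";
    case DESC_KIND_VECTOR:     return "vector";
    default:                   return NULL; /* "should not be reached" */
    }
}
```

Under these assumptions, `scan_kind(35027181184716800ULL)` returns NULL because the low three bits of that value are zero, so a descriptor read from the wrong memory can end up in the default branch.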
At first we thought the crash was more likely to happen under low-memory conditions (which is plausible, because it happens while the GC is running), so we looked for memory leaks. We found some, but even after fixing most of them the crashes are still there.
While we were looking for leaks, we tried a "git bisect" to find where the problem was coming from, but we were more focused on the leaks at the time, so I think I should retry a "git bisect" focused on triggering the crash (this is a little problematic because we have not yet found a way to reproduce the issue systematically on any of our test devices).
Things I've looked at that I thought could be related but that do not really look similar: https://docs.microsoft.com/en-us/xamarin/android/release-notes/11/11.1#corrected-garbage-collection-behavior-for-android-bindings-and-bindings-projects
Our app is a Xamarin.Forms app; here is the list of NuGet packages we use:
xml package references
```xml
all
runtime; build; native; contentfiles; analyzers; buildtransitive
./OpenId.AppAuth.Android.dll
```
So we have a few bindings libraries (com.onesignal, openid.appauth, ...) and the main native library here is SkiaSharp.
What I am currently stuck at: I succeeded in attaching gdb to my app as described here (https://github.com/xamarin/xamarin-android/blob/main/Documentation/workflow/DevelopmentTips.md#attaching-gdb-using-visual-studio-on-windows), and I think I also got my app to trigger the crash once, but there was virtually no information when printing the backtrace (just question marks).
My current goal is to build xamarin-android so that I can debug my app natively and try to get more information out of it (with debug symbols in mono sgen and so on; I would like to have the address and undefined-behavior sanitizers on mono and SkiaSharp if possible).
I successfully built xamarin-android and xabuild (I checked out the d17-0 branch because I wanted the same version I have in my current Visual Studio 2022; is that a good idea or not?). I had to change the PlatformTarget of xabuild.csproj to x64 because the msbuild in VS2022 seems to be 64-bit. If I run xabuild on the samples/HelloWorld project I can build and deploy an app, but if I add a PackageReference to Xamarin.Forms
diff
```diff
diff --git a/samples/HelloWorld/HelloWorld.csproj b/samples/HelloWorld/HelloWorld.csproj
index 2b5391ee..092a8090 100644
--- a/samples/HelloWorld/HelloWorld.csproj
+++ b/samples/HelloWorld/HelloWorld.csproj
@@ -52,6 +52,7 @@
{6BE66B30-9346-4DA6-B09A-0CDC1DFE33C2}
HelloLibrary
+
@@ -76,4 +77,4 @@
-
\ No newline at end of file
+
diff --git a/samples/HelloWorld/MainActivity.cs b/samples/HelloWorld/MainActivity.cs
index 43d1421e..3549205d 100644
--- a/samples/HelloWorld/MainActivity.cs
+++ b/samples/HelloWorld/MainActivity.cs
@@ -1,6 +1,8 @@
-using Android.App;
+using Android.App;
using Android.Widget;
using Android.OS;
+using Xamarin.Forms.Platform.Android;
+using Xamarin.Forms;
namespace HelloWorld
{
@@ -9,24 +11,22 @@ namespace HelloWorld
Label = "HelloWorld",
MainLauncher = true,
Name = "example.MainActivity")]
- public class MainActivity : Activity
+ public class MainActivity : FormsAppCompatActivity
{
int count = 1;
protected override void OnCreate (Bundle savedInstanceState)
{
- base.OnCreate (savedInstanceState);
-
+ base.OnCreate(savedInstanceState);
// Set our view from the "main" layout resource
- SetContentView (Resource.Layout.Main);
-
+ // SetContentView (Resource.Layout.Main);
// Get our button from the layout resource,
// and attach an event to it
- Button button = FindViewById
```
Somehow the referenced assemblies from the NuGet package do not get added to the csc command line and the project fails to build (
C:\src\xamarin-android\samples\HelloWorld\MainActivity.cs(4,15): error CS0234: The type or namespace name 'Forms' does not exist in the namespace 'Xamarin' (are you missing an assembly reference?) [C:\src\xamarin-android\samples\HelloWorld\HelloWorld.csproj]
), just as would happen if the PackageReference were not there. The same issue happens with my own project, so I am currently unable to build it with xabuild. (Maybe something in my environment is hindering correct behavior, or there is an issue with xabuild itself? If someone has an idea about this issue, that would be helpful. I attached a binlog for the modified HelloWorld: msbuild.binlog.zip)
For the items mentioned in the referenced issue: I tried disabling the concurrent garbage collector, but the app suffered a slowdown and it did not seem to fix the crashes. The crash seems to happen even on debug builds, not only on app store releases, but maybe less often...
What I did not try yet:
Steps to Reproduce
It is still very hard to reproduce, even for our team (especially on the emulator, where it seems to happen very rarely).
Did you find any workaround?
Not yet. Reverting my app to an old version did the trick for now, but we cannot move forward until we find what is causing this.
Relevant log output
development environment information
```
Microsoft Visual Studio Community 2022
Version 17.0.2
VisualStudio.17.Release/17.0.2+31919.166
Microsoft .NET Framework
Version 4.8.04161

Installed Version: Community

Visual C++ 2022  00482-90000-00000-AA768
Microsoft Visual C++ 2022

ASP.NET and Web Tools 2019  17.0.793.11735
ASP.NET and Web Tools 2019

Azure App Service Tools v3.0.0  17.0.793.11735
Azure App Service Tools v3.0.0

C# Tools  4.0.1-1.21568.1+6ab6601178d9fba8c680b56934cd1742e0816bff
C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Common Azure Tools  1.10
Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

Extensibility Message Bus  1.2.6 (master@34d6af2)
Provides common messaging-based MEF services for loosely coupled Visual Studio extension components communication and integration.

Microsoft JVM Debugger  1.0
Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

Microsoft MI-Based Debugger  1.0
Provides support for connecting Visual Studio to MI compatible debuggers

Microsoft Visual C++ Wizards  1.0
Microsoft Visual C++ Wizards

Microsoft Visual Studio VC Package  1.0
Microsoft Visual Studio VC Package

Mono Debugging for Visual Studio  17.0.11 (54f19d2)
Support for debugging Mono processes with Visual Studio.

NuGet Package Manager  6.0.1
NuGet Package Manager in Visual Studio. For more information about NuGet, visit https://docs.nuget.org/

ProjectServicesPackage Extension  1.0
ProjectServicesPackage Visual Studio Extension Detailed Info

Test Adapter for Boost.Test  1.0
Enables Visual Studio's testing tools with unit tests written for Boost.Test. The use terms and Third Party Notices are available in the extension installation directory.

Test Adapter for Google Test  1.0
Enables Visual Studio's testing tools with unit tests written for Google Test. The use terms and Third Party Notices are available in the extension installation directory.

TypeScript Tools  17.0.1001.2002
TypeScript Tools for Microsoft Visual Studio

Visual Basic Tools  4.0.1-1.21568.1+6ab6601178d9fba8c680b56934cd1742e0816bff
Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual C++ for Cross Platform Mobile Development (Android)  17.0.31822.380
Visual C++ for Cross Platform Mobile Development (Android)

Visual F# Tools  17.0.0-beta.21522.2+6d626ff0752a77d339f609b4d361787dc9ca93a5
Microsoft Visual F# Tools

Visual Studio Code Debug Adapter Host Package  1.0
Interop layer for hosting Visual Studio Code debug adapters in Visual Studio

Visual Studio IntelliCode  2.2
AI-assisted development for Visual Studio.

Visual Studio Tools for CMake  1.0
Visual Studio Tools for CMake

VisualStudio.DeviceLog  1.0
Information about my package

VisualStudio.Foo  1.0
Information about my package

VisualStudio.Mac  1.0
Mac Extension for Visual Studio

Xamarin  17.0.0.341 (d17-0@ac52790)
Visual Studio extension to enable development for Xamarin.iOS and Xamarin.Android.

Xamarin Designer  17.0.0.182 (remotes/origin/d17-0@ea204898d)
Visual Studio extension to enable Xamarin Designer tools in Visual Studio.

Xamarin Templates  17.0.17 (9e779b0)
Templates for building iOS, Android, and Windows apps with Xamarin and Xamarin.Forms.

Xamarin.Android SDK  12.1.0.5 (d17-0/6b0e6b2)
Xamarin.Android Reference Assemblies and MSBuild support.
    Mono: c633fe9
    Java.Interop: xamarin/java.interop/d17-0@febb1367
    ProGuard: Guardsquare/proguard/v7.0.1@912d149
    SQLite: xamarin/sqlite/3.36.0@a575761
    Xamarin.Android Tools: xamarin/xamarin-android-tools/d17-0@a5194e9

Xamarin.iOS and Xamarin.Mac SDK  15.2.0.17 (738fde344)
Xamarin.iOS and Xamarin.Mac Reference Assemblies and MSBuild support.
```