dotnet / android

.NET for Android provides open-source bindings of the Android SDK for use with .NET managed languages such as C#
MIT License
1.92k stars 526 forks source link

Known samples for AndroidEnableMarshalMethods=true #8253

Open jonathanpeppers opened 1 year ago

jonathanpeppers commented 1 year ago

Android application type

.NET Android (net7.0-android, etc.)

Affected platform version

.NET 8 Preview 6

Description

We have turned off AndroidEnableMarshalMethods for .NET 8 due to too many issues surfaced during public testing. We plan on resolving those issues so it can be turned on for .NET 9.

This is just an issue of known examples that fail at runtime when AndroidEnableMarshalMethods=true:

Just linking them here for us to test later in future releases.

Steps to Reproduce

See links above.

Did you find any workaround?

No response

Relevant log output

No response

tranb3r commented 1 month ago

@grendello Why did you enable marshal methods by default in net9-pre6? #8925 Did you test the known samples before releasing this feature again? My sample #8147 is broken in net9-pre6, and once again, I have to manually disable marshal methods.

grendello commented 1 month ago

@tranb3r This is a very useful feature which gives a noticeable performance increase for huge majority of apps. The only known bad scenario we had was with Blazor apps, and those disable marshal methods by default. We enabled them in preview precisely for the reason of finding what else breaks and add it to the "disable" list if we can identify the issue. Note that marshal methods themselves are fine, the bug doesn't lie in their implementation as far as we can tell (we've had them around for nearly two years, they essentially implement in native code what is implemented in managed code when they are disabled) - the sole fact that they are that much faster triggers a race condition (at least in Blazor) of some sort. We haven't been able to determine what exactly trips up Blazor. If your sample has a pattern we can identify, we'd add it to the "disabled" list.

For the gain in 90% (or more) of apps out there, it is worth enabling the feature by default and disable it explicitly (or automatically, if the pattern is identified as explained above) for the remaining 10% (or less) apps.

grendello commented 1 month ago

@tranb3r as far as testing is concerned, we have hundreds of thousands of tests that run on CI and none of them fail with marshal methods. Without enabling the feature by default, we won't ever be able to gather information about what doesn't work "out there".

grendello commented 1 month ago

@tranb3r have you seen the problem with anything other than Plugin.Fingerprint?

tranb3r commented 1 month ago

For the gain in 90% (or more) of apps out there, it is worth enabling the feature by default and disable it explicitly (or automatically, if the pattern is identified as explained above) for the remaining 10% (or less) apps.

Sure. However, considering I'm facing at least 2 different issues in my app, caused by this feature, I'm a bit concerned. Race condition usually occur with complex scenarios, that are difficult to reproduce and integrate in CI.

As far as testing is concerned, we have hundreds of thousands of tests that run on CI and none of them fail with marshal methods.

This is why I'm asking if the samples in this issue have been tested. My sample is failing, so I suspect this one at least hasn't been investigated/tested.

Have you seen the problem with anything other than Plugin.Fingerprint?

Yes. ANR when doing http requests. However I haven't been able to create a simple repro for it.

bcaceiro commented 1 month ago

This option should only be used in .net9 previews right? It is supposed to be somewhat broken on .net8? Because I have tried in the previous maui - sr release and also had some issues, and disabled it.

grendello commented 1 month ago

@bcaceiro it was present in net8 but was off by default, correct.

grendello commented 1 month ago

For the gain in 90% (or more) of apps out there, it is worth enabling the feature by default and disable it explicitly (or automatically, if the pattern is identified as explained above) for the remaining 10% (or less) apps.

Sure. However, considering I'm facing at least 2 different issues in my app, caused by this feature, I'm a bit concerned. Race condition usually occur with complex scenarios, that are difficult to reproduce and integrate in CI.

Superficial simplicity is unfortunately par for the course in a lot of .NET. One example of that is Blazor apps, another is async/await or LINQ. There is an awful lot of asynchronous or simply complicated code in those features that can fail under some circumstances without any obvious signals as to what's going on. Your sample uses an external service which definitely can introduce threading and/or race condition scenarios. That said, it's much, much simpler than the Blazor "hello world" app that we were investigating. I need to bring some tools up to date and hopefully your sample will give a clue as to what's failing and why.

As far as testing is concerned, we have hundreds of thousands of tests that run on CI and none of them fail with marshal methods.

This is why I'm asking if the samples in this issue have been tested. My sample is failing, so I suspect this one at least hasn't been investigated/tested.

The sad truth is that we don't have much time for manual testing. However, product previews are exactly where we can hope people like you, early testers, will discover issues and let us know about them. Alas, there are very few people following your footsteps and sometimes we learn about issues very late in the game. If we discover that marshal methods cause more trouble than we thought, we're going to disable them by default before the release and hopefully re-enable for .NET 10 - the feature is too valuable to just drop it (and we have more optimizations in mind that build on top of it).

Have you seen the problem with anything other than Plugin.Fingerprint?

Yes. ANR when doing http requests. However I haven't been able to create a simple repro for it.

That's another scenario involving threads and synchronization, open for race conditions. Hopefully all of them are caused by the same issue and maybe your sample app will help us pin point the problem and swat it for good.