dotnet / maui

.NET MAUI is the .NET Multi-platform App UI, a framework for building native device applications spanning mobile, tablet, and desktop.
https://dot.net/maui
MIT License
22.28k stars 1.76k forks source link

App crash after navigating back and forth to a page with many UI elements #24008

Open AdiCalinHTM opened 3 months ago

AdiCalinHTM commented 3 months ago

Description

The app crashes after repeated navigations to and away from a page with many UI elements. If we try to open a page with fewer UI elements, the crash will occur after more tries/navigations.

The bug was originally found on 8.0.4 version. We checked also on 8.0.7 and managed to reproduce it there as well. This was received in the test app

Screenshot 2024-08-02 at 16 58 15

In our production app, we also caught this exception: Screenshot 2024-07-30 at 15 31 16

In our production app, using the Android Studio profiler, we've also found memory leaks

image (1) image (2) image

From our findings it's seems to be a memory leak/memory management problem

Steps to Reproduce

  1. Clone the https://github.com/AdiCalinHTM/MemoryLeakTestApp
  2. Deploy the app on a real iOS device. It won't reproduce on simulators
  3. Navigate back and forth between the two pages of the app. The loading of the second page could take longer because there are many UI elements that needs to be loaded.
  4. Repeat navigation until the app crashes

Link to public reproduction project repository

https://github.com/AdiCalinHTM/MemoryLeakTestApp

Version with bug

8.0.70 SR7

Is this a regression from previous behavior?

Yes, this used to work in Xamarin.Forms

Last version that worked well

Unknown/Other

Affected platforms

iOS, Android

Affected platform versions

Independent of OS version

Did you find any workaround?

We've tried many workarounds and hacks, ranging from simplifying the pages to setting ConfigureAwait(false) to all the navigation task in our app. In the test app, calling the GC.Collect() after GoToAsync() seems to solve the issue, but in our production app, this did not work

Relevant log output

No response

github-actions[bot] commented 3 months ago

Hi I'm an AI powered bot that finds similar issues based off the issue title.

Please view the issues below to see if they solve your problem, and if the issue describes your problem please consider closing this one and thumbs upping the other issue to help us prioritize it. Thank you!

Closed similar issues:

Note: You can give me feedback by thumbs upping or thumbs downing this comment.

jfversluis commented 3 months ago

@jonathanpeppers ?

jonathanpeppers commented 3 months ago

@AdiCalinHTM what type is responsible for the leak? Your screenshot above shows Android, but this is an iOS-only issue?

Your sample also doesn't have any GC.Collect() calls for testing, did you already try the investigation mentioned here?

After adding some GC.Collect() calls for testing (remove them later), you should be able to collect two .gcdump files and diff them in Visual Studio. Thanks!

kevinxufei commented 3 months ago

I can repro this issue at Android & iOS platforms on the latest 17.6.13(build 424) (8.0.71 & 8.0.70). iOS: the app will crash. Android: the app will freeze. All test results are on physical devices

AdiCalinHTM commented 3 months ago

@AdiCalinHTM what type is responsible for the leak? Your screenshot above shows Android, but this is an iOS-only issue?

We did manage to reproduce it on both platforms.

Your sample also doesn't have any GC.Collect() calls for testing, did you already try the investigation mentioned here?

We've tried to do some memory leak testing initially, on iOS, but with no succes. During some test, we observed that calling GC.Collect() when navigating was stoping the crash

After trying some of the things listed here, I think that just calling GC.Collect() somewhere in a .xaml.cs file is preventing the crash.

After adding some GC.Collect() calls for testing (remove them later), you should be able to collect two .gcdump files and diff them in Visual Studio. Thanks!

For now, I've pushed to the test repo some .gcdump files that I've recorded and the GC.Collect() calls, if somebody whats to try the test. (Take the previous version or just delete the GC.Collect() calls to reproduce the issue ) I tried to compare them, but didn't notice anything that looked like a memory leak. However, I'm not sure if I did the recording properly. I've used this tool https://learn.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-gcdump and I'm using a Mac for deploying the app and recording, but I've opened the .gcdump files on Windows, with VisualStudio. Sorry for the late response and thank you for reply :)

jonathanpeppers commented 3 months ago

After trying some of the things listed here, I think that just calling GC.Collect() somewhere in a .xaml.cs file is preventing the crash.

With the symptoms you are describing, it sounds like there is not actually a leak, and the GC is just not able to keep up with this particular screen/UI. If calling GC.Collect() is avoiding a crash, you might use that as a workaround temporarily, but I'm not a fan of having to use that.

I think I've caught up today, so I'll test the sample and see what I find.

jonathanpeppers commented 3 months ago

The thing I'm noticing, is the general performance of this page -- doesn't seem fast to me.

One thing you could do for now, is take a pass at simplifying the XAML, I noticed the "heaviest" part is:

image

image

If the page has many OneLineCell instances, can the number of views inside each one be less?

So, I wonder if you could reduce:

I noticed there are many duplicated lines like:

LeadingImageSource="{Binding Cell.LeadingImageSource}"
LeadingInactiveImageSource="{Binding Cell.LeadingInactiveImage}"
TrailingImageSource="{Binding Cell.TrailingImageSource}"
TrailingInactiveImageSource="{Binding Cell.TrailingInactiveImageSource}"

So, I wonder if this could be set in C# OnBindingContextChanged instead of creating so many bindings.

The other thing I see, maybe MauiCALayer leaks, I will investigate that:

image

jonathanpeppers commented 3 months ago

I am not able to find a problem with MauiCALayer:

I tried it in a test, and MauiCALayer is successfully GC'd.

@AdiCalinHTM if you are successful in simplifying the sample, let me know and I can profile again. I will try this as well, but I would probably be commenting/removing code, which won't be exactly the same.

AdiCalinHTM commented 3 months ago

I've pushed the changes to the repo. I've simplified the sample by deleting all the bindings in OneLineCellPage and also deleting a good chunk of the oneLineCells in that page. The crash is still reproducible. More elements can be removed from that page and loading will be faster, but the crash will still happen after multiple navigations. This is the behavior that we also see on other components and UI pages that we've made and use. If it's more complex, the crash will happen faster. I've also tried to follow you other recommendations. That is the most I can do to still keep a form similar to the one we use in our production code. After debugging, all I can see is this error

Screenshot 2024-08-07 at 10 25 00
jonathanpeppers commented 3 months ago

Ok, @AdiCalinHTM, thank you for the changes! That eliminates some of the noise, and I noticed there are <Frame/> objects used in your XAML. There may be a memory leak in FrameRenderer, the underlying type. FrameRenderer (or any "renderer") is in a "compatiblity" namespace, so I suspect this is only kept around to ease Xamarin.Forms migration.

If you use <Border/> instead of <Frame/>, does that improve things for you?

AdiCalinHTM commented 3 months ago

Hey, @jonathanpeppers, thank you!

I've made a conversion from <Frame/> elements to <Border/> and pushed the changes to the repo. The issue is still reproducible and it seems to reproduces a little easier/faster after this change.

This morning and after this change, was the first time when I saw this error:

error: * Assertion at /Users/runner/work/1/s/src/mono/mono/utils/lock-free-alloc.c:145, condition `sb_header' not met, function:alloc_sb, Failed to allocate memory for the lock free allocator

Screenshot 2024-08-08 at 11 05 30

I don't see it after every crash and still, the most common error that I've got is the one mentioned above, about the Garbage collector.

I also need to mention that I see this Native Crash Reporting after about every crash. I don't know if this helps with anything (after cleaning bin and obj this Native Crash Reporting does not appear in the first deploy and crash sequence)

Screenshot 2024-08-08 at 11 05 54
jonathanpeppers commented 3 months ago

@AdiCalinHTM how many times do I need to navigate back and forth to see the OutOfMemoryException or crash? Before, I was just doing it a few times and taking a diff of .gcdump files, and it never crashed for me.

Should I be testing in the iOS simulator or on an iOS device?

AdiCalinHTM commented 3 months ago

@AdiCalinHTM how many times do I need to navigate back and forth to see the OutOfMemoryException or crash? Before, I was just doing it a few times and taking a diff of .gcdump files, and it never crashed for me.

@jonathanpeppers on an average, 6 navigations will lead to a crash. Could go up to 12 sometimes, in my case, but I'm using an iPhone XR, real device, for testing. On a newer iPhone, it would probably need a few more navigations back and forth to reproduce it or you could add more oneLineCells in that page and will crash faster.

We cannot reproduce the crash on simulators.

The OutOfMemoryException was caught in our production app. In the test app we got the abovementioned errors along with the crash.

PureWeen commented 3 months ago

If you're able to test this on net9 preview7 that would be helpful.

BjoernMSG commented 3 months ago

@PureWeen @jonathanpeppers I've tested it on dotnet9 (Preview 7) and seems to be no difference. Im still able to reproduce the crash after a few navigations.

Do you have any idea how we can proceed here? this bug is blocking us to release our maui app so it is quite important topic.

If helpful maybe we can arrange a short discussion via teams to find a solution or at least a workaround

jfversluis commented 3 months ago

@BjoernMSG thanks for trying that out! I have pinged Jonathan again and see if there are any more ideas he might have and see if he can test on a physical device.

Just for some background information. Is this something you're discovering now? Did this not happen in earlier .NET MAUI versions? I see you mention 8.0.40, but were you using something before that? If so, what is the version where you didn't see this behavior?

I guess the reason why it only happens on physical devices, or at least sooner on physical devices, is that the Simulator takes the full memory available on your development machine so it will take longer to hit that limit.

AdiCalinHTM commented 3 months ago

Just for some background information. Is this something you're discovering now? Did this not happen in earlier .NET MAUI versions? I see you mention 8.0.40, but were you using something before that? If so, what is the version where you didn't see this behavior?

Hi @jfversluis! I've tried to downgrade the MAUI version to 8.0.0, 8.0.10 in the test app and when I deploy, it just crashes When we started the migration, this two where the first versions that we used in our production app and ever since then, we had frequent crashes and bugs of all kinds. We managed to solve most of them, but this crash is a problem we can't link to anything else. I think the problem was always there, but can't be sure.

I guess the reason why it only happens on physical devices, or at least sooner on physical devices, is that the Simulator takes the full memory available on your development machine so it will take longer to hit that limit.

This is also our guess.

davishoang96 commented 3 months ago

We're having the similar issue to our internal app. We tried to use the app 4 times, then it crashed on iOS, frozen on Android.

jfversluis commented 3 months ago

There must be something that triggers this that is specific to your situations. It would, as always, be super helpful if you could somehow create a reproduction project for this. Then we won't have to guess about what might be going on here but we can actually have a look at test things out on our side.

BjoernMSG commented 3 months ago

@jfversluis https://github.com/AdiCalinHTM/MemoryLeakTestApp With this project you should be able to reproduce the issue. Ive created an appium test and most time it crash with less then five navigation to the subpage and back to the main page (deployed on real ios device)

jfversluis commented 3 months ago

@BjoernMSG oh sorry, seems I missed that in the opening comment, my bad. Besides providing this and showing how it happens, did you play around with removing elements? I see from the above discussion that you might have already tried replacing some elements, or while putting this repro together, when was the moment that it actually started breaking?

Did you try alternatives here by not using a ScrollView but maybe a CollectionView or a BindableLayout or something? Just thinking out loud here to on the one hand try to get closer to the actual cause and at the same time hopefully unblock you so you can move forward with this.

AdiCalinHTM commented 3 months ago

@BjoernMSG oh sorry, seems I missed that in the opening comment, my bad. Besides providing this and showing how it happens, did you play around with removing elements? I see from the above discussion that you might have already tried replacing some elements, or while putting this repro together, when was the moment that it actually started breaking?

Nope, there was not a moment, after adding the second page, where the crash would not reproduce.

I deleted everything related to the images in the oneLineCell component and from the elements that compose it because, initially, we thought that was where the problem would come from.

We tried to put fewer elements on the page and in that case the crash was still reproducing, it just took a few more navigations.

We initially used many binding in the OneLineCellPage and had some frame elements, which are obsolete. We replaced frame elements with boarder and removed all bindings and the crash still reproduces.

I replaced the ScrollView's with VerticalStackLayout and Grids the crash is still happening with the same frequency. I've put all the OneLineCell elements in a Grid and then in a VerticalStackLayout, with no other container, and the crash reproduces the same way.

We did some digging, but the crash keeps reproducing. The version of the test app in that repo is using just standard, up-to-date, MAUI controls and it's very simple.

jonathanpeppers commented 3 months ago

I tested the above sample on an iPhone 12 Pro, and I can see it crash after running out of memory.

After testing a bit more, I don't think there is a leak here after changing <Frame/> to <Border/> -- the app is just using too much memory.

OneLineCellPage has 46 OneLineCell, and OneLineCell has 100 lines or so of XAML and child controls.

So, I tried removing most of the code in OneLineCell:

With this change, the app doesn't run out of memory for me -- I navigated 10-20 times with no issue. The page also navigates instantly, where there is a noticeable slowdown creating the page before.

Unfortunately, the app isn't too useful with all the code removed!

Thoughts on a Solution

Some ideas, I see to make this page work currently:

This might take a while...

In the meantime, is there a possibility:

It seems like with a little work, could get OneLineCell to use maybe 25% less child views, etc. This might be enough for the sample to not run out of memory.

I will profile a bit more and see if I come up with other ideas.

BjoernMSG commented 3 months ago

We have a finding which seems to "solve" the problem. If we wrap in the "OneLineCell" the complete grid into a ContentView, the crash seems to not happen anymore. Ive tested it with an appium test and 100 times navigate to the subpage and back again.

Do you have an idea why this solve the problem? @jonathanpeppers

jonathanpeppers commented 3 months ago

I think Grid has an internal data-structure for each child, maybe it uses a lot of memory:

https://github.com/dotnet/maui/blob/ebf760463bae3f7e03e5ad41d5bc57852c114235/src/Core/src/Layouts/GridLayoutManager.cs#L54-L83

Thanks for this hint, I was just about to profile more -- so I'll test the difference between Grid and ContentView.

jonathanpeppers commented 3 months ago

@BjoernMSG are you able to share the changes for the sample:

The main type I see taking up memory are Microsoft.Maui.Controls.SetterSpecificityList objects, but it doesn't seem like Grid would be any worse than ContentView for this. All BindableProperty are backed by this list.

.NET 9 has this change related to SetterSpecificityList:

So, we could check if .NET 9 fares any better for this app.