dotnet / maui

.NET MAUI is the .NET Multi-platform App UI, a framework for building native device applications spanning mobile, tablet, and desktop.
https://dot.net/maui

Memory leak when VisualElement are no longer in view on Android #21853

Open peirens-bart opened 6 months ago

peirens-bart commented 6 months ago

Description

I created several pages and navigate from the main page to them and back. When navigating back, the page is removed from the visual tree, but the app memory goes up. When I change the page DI setup from singleton to transient, you can see that even the 100 MB that I put on Page2 is not collected when the page is removed from the visual tree; I need to set the variable to null. So when a transient page stores a big variable, the page is not collected when it is removed from the visual tree. When switching to singleton, the memory of the app still grows even when the page is not in the visual tree, although the page should only be added once. I think the memory should only go up once, when the singleton is created.

This happens only on Android.

Screenshot of navigating from the main page to Page1 and back; you can see the grey line going up. This is a problem with an industrial app we want to build that needs to run for 10 hours. At some point the app slows down because of this. image

Steps to Reproduce

  1. Run the sample app
  2. Change the DI registration from singleton to transient

Link to public reproduction project repository

https://github.com/peirens-bart/MauiMemory

Version with bug

8.0.20 SR4

Is this a regression from previous behavior?

Yes, this used to work in Xamarin.Forms

Last version that worked well

Unknown/Other

Affected platforms

Android

Affected platform versions

No response

Did you find any workaround?

No. I can't make a gcdump (I always get a write error) to see what is stored in memory. My suspicion is that the controls used are not disposed.

Relevant log output

No response

marco-skizza commented 6 months ago

Hi @peirens-bart

I may be wrong, but I would be surprised if such simple pages still leaked.

Have you already seen the following page: https://github.com/dotnet/maui/wiki/Memory-Leaks

Basically what I would try:

As a starting point you can look at my code for another iOS issue: https://github.com/marco-skizza/MauiPageMemory

I hope I'm not misguiding you... 😃

peirens-bart commented 6 months ago

Hi @marco-skizza

I already added GC.Collect() in the navigation service when clearing the stack, but based on your example I now print the memory on the main page:

    protected override void OnAppearing()
    {
        base.OnAppearing();
        var totalMemory = GC.GetTotalMemory(true);
        memTxt.Text = $"Memory: {totalMemory}";
    }

The first screenshot is after boot, the second is after going to Page2 and back (first time creating the singleton), and the third is after going to Page2 and back 10 times. image image

image

For the other tooling I always get this exception: image

marco-skizza commented 6 months ago

Hi @peirens-bart

I added some logging to the destructor (finalizer) of the pages:

    ~Page2()
    {
        Console.WriteLine("~Page2() called");
    }

That gets logged when I set the pages to transient - and even when removing the _big = null; in Page2.

But maybe there are also memory leaks besides pages not being garbage collected...?!
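The finalizer check described above can be tried outside MAUI as well. The following is a minimal sketch, assuming a plain .NET console project; `FakePage` and `FinalizerProbe` are hypothetical names used only for illustration and are not part of the sample repository.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical stand-in for a page; not a real MAUI type.
public sealed class FakePage
{
    // Counts how many instances have been finalized so far.
    public static int FinalizedCount;

    ~FakePage()
    {
        Interlocked.Increment(ref FinalizedCount);
        Console.WriteLine("~FakePage() called");
    }
}

public static class FinalizerProbe
{
    // Allocates a page in its own stack frame so that no local variable
    // in the caller keeps it alive after this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDrop()
    {
        _ = new FakePage();
    }

    // Returns how many finalizers ran as a result of this call.
    public static int Run()
    {
        int before = Volatile.Read(ref FakePage.FinalizedCount);
        CreateAndDrop();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Volatile.Read(ref FakePage.FinalizedCount) - before;
    }
}
```

If `FinalizerProbe.Run()` reports that the finalizer ran, the object was unreachable. A finalizer that never runs for a page suggests that something still references it, although it does not say what.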

peirens-bart commented 6 months ago

Hi @marco-skizza

I added a Page3 to demonstrate something that could be that other thing. It has to do with the fact that OnAppearing is called while the page is disappearing, just before going home. image After going home, I don't expect OnAppearing to be called at that time. image

This is also an issue when the add needs some time, because going back will then feel slow.

marco-skizza commented 6 months ago

Hi @peirens-bart

I don't know if that's even a different problem. Because this doesn't seem to happen when calling Page1 or Page2 with the memory leak.

But I better leave this to the official .NET MAUI team...

marco-skizza commented 6 months ago

Hi @peirens-bart

P.S.: Regarding the navigating back by manually removing pages from the navigation stack:

I can imagine this is desired behavior: when removing page by page from the stack, each time the underlying page becomes the current one and is seemingly "activated".

You could prevent these additional calls to OnAppearing and OnDisappearing by navigating back, e.g. with the route ../../.. (when navigating back three pages)...
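A small helper can build such a relative route for any number of levels. This is only a sketch: `BackRoute` is a hypothetical name, and the commented call at the end assumes a Shell-based app where `Shell.Current.GoToAsync` is available.

```csharp
using System;
using System.Linq;

public static class BackRoute
{
    // Builds a relative Shell route that pops `levels` pages,
    // e.g. 3 -> "../../..".
    public static string Build(int levels)
    {
        if (levels < 1)
            throw new ArgumentOutOfRangeException(nameof(levels), "At least one level is required.");

        return string.Join("/", Enumerable.Repeat("..", levels));
    }
}

// In a MAUI Shell app this would be used roughly as:
//     await Shell.Current.GoToAsync(BackRoute.Build(3));
```

The caller still has to know how many pages were pushed, so this only helps when that count is tracked somewhere.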

peirens-bart commented 6 months ago

@marco-skizza

That is correct, but then I need to track how many navigations I did. At the moment we have different pages, each providing a view for a step that represents the work done in the physical world. This means we sometimes have to prevent "Go Back" via the physical keyboard on the Android device, because the workflow can only go forward. Since every flow has a different number of steps to process, we navigate back to the start when starting a new loop. A "Clear" method would also help.

The easy part is that we can set a page as the active view without the shell, but if we do that we always get a white blinking screen in between.

Larhei commented 6 months ago

@peirens-bart regarding your tooling exception: what nobody tells you is that it only works when you move your PowerShell session to a directory you can write to. c:\Program Files\xxxx should be write-protected... So cd your way to a directory like c:\temp, then it should work.

AdamEssenmacher commented 6 months ago

I don't see a leak here.

If you want to get the GC to behave deterministically, you need to call GC.Collect() several times.
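One common way to do this is to alternate `GC.Collect()` with `GC.WaitForPendingFinalizers()` and then check a `WeakReference` to the object under suspicion. The sketch below assumes plain .NET; `GcProbe` and its members are hypothetical names, and the default of three passes is an arbitrary choice.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcProbe
{
    // Runs several full, blocking collections and lets finalizers run
    // between passes, so objects released by finalizers are reclaimed too.
    public static void CollectFully(int passes = 3)
    {
        for (int i = 0; i < passes; i++)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        GC.Collect();
    }

    // Allocates a buffer and returns only a weak reference to it, so the
    // caller does not keep the buffer alive by accident.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateTracked(int bytes)
    {
        return new WeakReference(new byte[bytes]);
    }

    // True if the tracked object is still alive after a full collection.
    public static bool IsStillAlive(WeakReference tracked)
    {
        CollectFully();
        return tracked.IsAlive;
    }
}
```

In a page-based app the same idea would be to store a `WeakReference` to a page before navigating away and check it afterwards. A reference that stays alive after several collections points to a real leak rather than a GC that has not run yet.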

peirens-bart commented 6 months ago

@peirens-bart regarding your tooling exception: what nobody tells you is that it only works when you move your PowerShell session to a directory you can write to. c:\Program Files\xxxx should be write-protected... So cd your way to a directory like c:\temp, then it should work.

Sorry, same result. Can you show how to do it using an Android emulator? image

peirens-bart commented 6 months ago

I see a lot of MAUI-related UI objects that are not getting disposed when a page is removed from the visual tree. image After some navigation in our app (3 minutes): image

We went live this weekend, but it is terrible at the moment: every 2 hours users need to reboot or wait for the OOM to happen. Does anyone have an idea how to fix this or how to dispose these content-related objects, or know a good alternative for an Android app written in C#?

AdamEssenmacher commented 6 months ago

https://github.com/AdamEssenmacher/MemoryToolkit.Maui

That might help you track down where your leaks are happening.

peirens-bart commented 6 months ago

https://github.com/AdamEssenmacher/MemoryToolkit.Maui

That might help you track down where your leaks are happening.

I tried that but nothing came up. I think this is related to #21007

@SittenSpynne You nailed it in this discussion https://github.com/dotnet/maui/discussions/21918#discussion-6524507 image

peirens-bart commented 5 months ago

I updated the title and the example project because we investigated the issue on our side, and it has nothing to do with navigation but with the visual state of the element. So the new example works without navigation and just puts a blinky rectangle on the screen.


        public App()
        {
            InitializeComponent();

            MainPage = new Pages.Page4();// new AppShell();
        }

Then Page4 looks like this:

<ContentPage xmlns="http://schemas.microsoft.com/dotnet/2021/maui"
             xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml"
             x:Class="MauiMemory.Pages.Page4"
             Title="Page4">
    <VerticalStackLayout>
        <Label 
            Text="Blinky App"
            VerticalOptions="Center" 
            HorizontalOptions="Center" />
        <Rectangle x:Name="Blinker"
            BackgroundColor="Red"
            WidthRequest="50"
            HeightRequest="50"
            VerticalOptions="Center">

        </Rectangle>
    </VerticalStackLayout>
</ContentPage>

With a simple timer to create a blinky app

public partial class Page4 : ContentPage
{
    private Timer? _timer;
    public Page4()
    {
        InitializeComponent();
    }

    protected override void OnAppearing()
    {
        base.OnAppearing();

        _timer = new Timer(callback =>
        {
            Dispatcher.Dispatch(() =>
            {
                Blinker.IsVisible = !Blinker.IsVisible;
            });
        }, null, 0, 1000);
    }

    protected override void OnDisappearing()
    {
        base.OnDisappearing();
        _timer?.Dispose();
    }
}

After 5 minutes the memory footprint looks like this. Even when I forced a GC via the profiling tool, the memory is not like the original when the page was first rendered.

image

jonathanpeppers commented 4 months ago

I forced a GC via the profiling tool the memory is not like the original

So, Android Studio's "force a GC button" won't trigger a .NET GC. Android doesn't know what .NET is.

I'm reviewing the sample now, though; will report what I find.

jonathanpeppers commented 4 months ago

@peirens-bart what object in your sample do you think is leaking?

I made these changes to assist debugging this:

I see the log message:

06-04 13:51:38.077 15995 16013 I DOTNET  : Page3 Destructor
06-04 13:51:44.541 15995 16013 I DOTNET  : Page3 Destructor

Which indicates that Page3 is fine. I was just navigating back and forth to see this.

The other pages live forever, because you registered them as singletons:

builder.Services.AddSingleton<AppShell>();
builder.Services.AddSingleton<MainPage>();
builder.Services.AddSingleton<Page1>();
builder.Services.AddSingleton<Page2>();
builder.Services.AddTransient<Page3>();

What C# object should I be looking for here? I should be able to compare two gcdump files and diff them.
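The effect of the two lifetimes on object retention can be illustrated without MAUI. The sketch below is a hypothetical miniature container that only mimics what singleton and transient registrations do. It is not the Microsoft.Extensions.DependencyInjection implementation, and `MiniContainer`, `Page2Stub`, and `Page3Stub` are invented names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature container; it only mimics the two lifetimes.
public sealed class MiniContainer
{
    private readonly Dictionary<Type, Func<object>> _factories = new();

    // Singleton: the first instance is cached and stays rooted by the
    // container for as long as the container itself is alive.
    public void AddSingleton<T>() where T : class, new()
    {
        var cached = new Lazy<object>(() => new T());
        _factories[typeof(T)] = () => cached.Value;
    }

    // Transient: every resolve creates a new instance that the container
    // does not hold on to, so it is collectible once the caller drops it.
    public void AddTransient<T>() where T : class, new()
    {
        _factories[typeof(T)] = () => new T();
    }

    public T Resolve<T>() where T : class => (T)_factories[typeof(T)]();
}

// Hypothetical stand-ins for the pages in the sample.
public sealed class Page2Stub { }
public sealed class Page3Stub { }
```

Resolving `Page2Stub` twice from a container where it is registered as a singleton yields the same instance, which is why a finalizer on such a page is not expected to run while the app is alive. Resolving a transient `Page3Stub` twice yields two separate instances.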

peirens-bart commented 4 months ago

@jonathanpeppers I don't think it is a page that is leaking but the visuals on the page. Can you have a look at Page4? It creates a red rectangle that blinks. Every time the rectangle is hidden, I think the visual element is removed from the visual tree but not removed from memory, and when it is made visible again a new one is created. I could be wrong; I am not sure how to debug that. Assuming that what I think is happening is correct, and looking back at the navigation: when navigating away from a page, the page itself is disposed, but the elements on the page are removed from the visual tree and not from memory. For some reason I can't create a dump with the tools on my machine.

jonathanpeppers commented 4 months ago

What type do you think is leaking? Namespace and type name?

Should I navigate around, or just let the page sit a while blinking?

jonathanpeppers commented 4 months ago

I did a diff before/after navigating to Page3 three times:

image

These all look like they would go away on the next GC.

peirens-bart commented 4 months ago

What type do you think is leaking? Namespace and type name?

Should I navigate around, or just let the page sit a while blinking?

Just wait 5 minutes before taking the diff. Can you share the setup you are using to see the dump in VS?

peirens-bart commented 4 months ago

@jonathanpeppers If you have some time we can have a short call about it?

jonathanpeppers commented 4 months ago

I am following the guide here, linked from https://aka.ms/profile-maui:

Just wait 5 minutes before taking the diff

So test Page4? and wait a while? I made this change:

But so far it seems fine:

image

I keep waiting, but there are only a few fresh objects until the next GC runs again.

image

peirens-bart commented 4 months ago

I followed the same guide, but the tool throws exceptions on my machine.

I am following the guide here, linked from https://aka.ms/profile-maui:

https://github.com/xamarin/xamarin-android/blob/main/Documentation/guides/tracing.md#memory-dumps-for-android-in-net-8

What is the overall memory of the app between dumps? Is it leaking, or is something else leaking outside the MAUI code? Can you attach the dumps to the ticket?

peirens-bart commented 4 months ago

@jonathanpeppers I see that you are calling the GC when the timer elapses. Is there any reason for that? Can you make a 5-minute diff without calling the GC and see if that makes a difference?

jonathanpeppers commented 4 months ago

The GC.Collect() call is to just "debug" memory problems. Otherwise the GC is non-deterministic and you'll never know if there is a leak or if the GC is just deciding not to run yet.
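To make that concrete, managed memory can be sampled only after a forced collection, so that two readings are comparable. This is a minimal sketch, assuming plain .NET; `ManagedMemory` is a hypothetical helper. It measures only the managed heap and says nothing about Java or native allocations on Android.

```csharp
using System;

public static class ManagedMemory
{
    // Managed heap size in bytes after forcing a full collection, so the
    // number reflects live objects rather than garbage awaiting the GC.
    public static long LiveBytes()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return GC.GetTotalMemory(forceFullCollection: true);
    }

    // Runs `action` and returns how many managed bytes remain allocated
    // afterwards compared to before (the result can be negative).
    public static long Growth(Action action)
    {
        long before = LiveBytes();
        action();
        return LiveBytes() - before;
    }
}
```

If `Growth` stays close to zero across repeated runs of the same scenario, the managed side is probably not the source of the growth seen in the Android Studio graph.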

Here are my two recent dumps after the app sat blinking a while: blinking.zip

jonathanpeppers commented 4 months ago

Try these versions:

 > dotnet tool list -g
Package Id                         Version                       Commands
--------------------------------------------------------------------------------------
dotnet-dsrouter                    8.0.510501                    dotnet-dsrouter
dotnet-gcdump                      8.0.510501                    dotnet-gcdump
dotnet-trace                       8.0.510501                    dotnet-trace

peirens-bart commented 4 months ago

Try these versions:

 > dotnet tool list -g
Package Id                         Version                       Commands
--------------------------------------------------------------------------------------
dotnet-dsrouter                    8.0.510501                    dotnet-dsrouter
dotnet-gcdump                      8.0.510501                    dotnet-gcdump
dotnet-trace                       8.0.510501                    dotnet-trace

PS C:\Program Files\Microsoft Visual Studio\2022\Preview> dotnet-trace collect -p 50240
No profile or providers specified, defaulting to trace profile 'cpu-sampling'

Provider Name                           Keywords            Level               Enabled By
Microsoft-DotNETCore-SampleProfiler     0x0000F00000000000  Informational(4)    --profile
Microsoft-Windows-DotNETRuntime         0x00000014C14FCCBD  Informational(4)    --profile

[ERROR] System.IO.EndOfStreamException: Unable to read beyond the end of the stream.
   at System.IO.BinaryReader.InternalRead(Int32 numBytes)
   at System.IO.BinaryReader.ReadUInt16()
   at Microsoft.Diagnostics.NETCore.Client.IpcHeader.Parse(BinaryReader reader) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcHeader.cs:line 55
   at Microsoft.Diagnostics.NETCore.Client.IpcMessage.Parse(Stream stream) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcMessage.cs:line 117
   at Microsoft.Diagnostics.NETCore.Client.IpcClient.Read(Stream stream) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcClient.cs:line 107
   at Microsoft.Diagnostics.NETCore.Client.IpcClient.SendMessageGetContinuation(IpcEndpoint endpoint, IpcMessage message) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcClient.cs:line 44
   at Microsoft.Diagnostics.NETCore.Client.EventPipeSession.Start(IpcEndpoint endpoint, EventPipeSessionConfiguration config) in /_/src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsClient/EventPipeSession.cs:line 34
   at Microsoft.Diagnostics.Tools.Trace.CollectCommandHandler.Collect(CancellationToken ct, IConsole console, Int32 processId, FileInfo output, UInt32 buffersize, String providers, String profile, TraceFileFormat format, TimeSpan duration, String clrevents, String clreventlevel, String name, String diagnosticPort, Boolean showchildio, Boolean resumeRuntime, String stoppingEventProviderName, String stoppingEventEventName, String stoppingEventPayloadFilter, Nullable`1 rundown) in /_/src/Tools/dotnet-trace/CommandLine/Commands/CollectCommand.cs:line 275

You have any suggestion on the memory, or what I can do? Can I contact you directly?

jonathanpeppers commented 4 months ago

System.IO.EndOfStreamException: Unable to read beyond the end of the stream.

A coworker saw this, but reinstalled the dotnet tools, which might have fixed it? (Or could have been just trying again)

You have any suggestion on the memory, or what I can do?

I'm not seeing a problem here, maybe you can share what object you see leaking? If you can see something in Android Studio, which object is it?

peirens-bart commented 4 months ago

I'm not seeing a problem here, maybe you can share what object you see leaking? If you can see something in Android Studio, which object is it?

That is the thing: I don't know. The only thing I can see in Android Studio is that the memory is going up. In our production app we even got an OOM after a few hours. But I already added finalizers to all pages and they get hit, and I used weak references for all events. My guess is that when using an object that inherits from 'VisualElement', the element itself is not totally disposed. I don't see the issue when running the same app on Windows; it only occurs when running the app on Android (real device or emulator).

I added the heap dump from Android Studio taken after running the blinky app for 5 minutes:

memory-20240412T174604-001.zip

jonathanpeppers commented 4 months ago

I can see the two files, thanks.

Is 001 the first dump? image

Then 002 has: image

It seems like it went up by ~200 bytes, what is the time between the two? That might be "fine" and the two GCs (.NET and Java) are behind. I haven't found an easy way to diff these two files yet.

peirens-bart commented 4 months ago

I haven't found an easy way to diff these two files yet.

Same here; that is part of the problem I have. I also can't take a dump with the tooling: I reinstalled all the tools but get the same error, so maybe there is a conflict with something else on my machine.

I ran the blinky app again. At boot we start with 92.1 MB for the app process. image

After 12 minutes we have 101.2 MB. image

jonathanpeppers commented 4 months ago

@peirens-bart right, so if we can't see a Microsoft.Maui object alive, it's not an issue with .NET MAUI. This could be an Android bug?

Have you had any more luck with dotnet-gcdump?

jonathanpeppers commented 4 months ago

Can you also test with a physical device instead of an emulator?

peirens-bart commented 4 months ago

Sorry for the delay. I asked a colleague to run it on a device used in production. image

image

image