dotnet / android

.NET for Android provides open-source bindings of the Android SDK for use with .NET managed languages such as C#
MIT License
1.93k stars 527 forks source link

Large Object Stack leaking memory #1558

Closed RowanHarris closed 6 years ago

RowanHarris commented 6 years ago

Steps to Reproduce

  1. Run the simple project
  2. Watch the android logcat and look for the garbage collections which show the large object used size
  3. If the size is not increasing, increase the size of the "Size" variable in MainActivity.cs, the larger your device physical memory the larger the size required to cause the LOS to start growing with each allocation.

TestMemory.zip

Expected Behavior

The LOS size should not increase when memory is allocated then released.

Actual Behavior

The LOS can keep growing, reporting the amount of memory used in the LOS is larger than the actual physical memory of the device.

Here is a more detailed description of the issue.

We have been having issues of the large object stack increasing in size and not releasing memory eventually causing out of memory exceptions. The less memory the device has the more likely it is to happen.

The large object stack also can report that its size is larger than the real physical RAM size.

In our real application it is very complex with thousands of memory objects of all sizes, the issue can happen with 5MB allocations when there are lots of other objects in the LOS.

However to make this repeatable I have narrowed it down to a very simple application. I created a new project and added just a hand full of lines of code to get this to repeat. Because the app is very simple the allocations needed to be much larger than in our app but the end result is the same.

while (true)
{
    activityManager.GetMemoryInfo(tmp);
    Log.Info("GetProcessMemoryInfo", "Size:" + size 
        + " GetTotMem:" + GC.GetTotalMemory(false) / 1000000f 
        + " avail:" + tmp.AvailMem / 1000000f + " total:" + tmp.TotalMem / 1000000f);

    var bff = new byte[size * 1000000];

    Thread.Sleep(2000);
}

I just add a thread that every 2 seconds does an allocation of memory, that is then ready for the garbage collector to dispose of it. The 2 seconds can be extended to 30 seconds but the result is the same the speed of the allocation seems to have no effect.

I then check the memory statistics and print them to the android log.

I print out the GC.GetTotalMemory(false) result as well as the total memory and free memory returned from activityManager.GetMemoryInfo

Every device has an allocation size that this issue starts to happen. The more physical memory the larger the allocations need to be to start the LOS growing. With a device with 1GB of RAM the allocation size tends to be around 100MB to cause the climb. On a device with 2GB It was around 300MB. On a device with 4GB, it was around 700MB. Basically I just start with a low value and keep increasing until the LOS starts to grow every allocation call. So far every device I have tested has shown the problem.

Each call to allocated memory can increase the LOS size, even if I force a CG.Collect() the garbage before the allocation the LOS size keeps growing. However the free memory reported by GetMemoryInfo does not reduce, so the free memory is not effected by this strange very large LOS. Just the LOS keeps thinking that it is growing. It can grow to be more than the physical memory size which is not possible.

After some time the app can Crash with out of memory exception however it never tried to kill any of the running applications like usually happens when the system is running out of memory, that is because OS thinks there is still plenty of free memory.

Yes the memory allocations are very high and not normal, however this is just to make it repeatable, very similar behavior was observed for smaller allocations in a complex application. So for a simple application the allocations needed to be much larger.

Version Information

Microsoft Visual Studio Community 2017 Version 15.6.6 VisualStudio.15.Release/15.6.6+27428.2037 Microsoft .NET Framework Version 4.7.02556

Installed Version: Community

Visual C++ 2017 00369-60000-00001-AA125 Microsoft Visual C++ 2017

ASP.NET and Web Tools 2017 15.0.40314.0 ASP.NET and Web Tools 2017

C# Tools 2.7.0-beta3-62715-05. Commit Hash: db02128e6e3c4bdfc93e6ec425ac9162b4d4fe80 C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Common Azure Tools 1.10 Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

JavaScript Language Service 2.0 JavaScript Language Service

JetBrains ReSharper Ultimate 2017.3.5 Build 111.0.20180323.160610 JetBrains ReSharper Ultimate package for Microsoft Visual Studio. For more information about ReSharper Ultimate, visit http://www.jetbrains.com/resharper. Copyright © 2018 JetBrains, Inc.

Merq 1.1.17-rc (cba4571) Command Bus, Event Stream and Async Manager for Visual Studio extensions.

Microsoft JVM Debugger 1.0 Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

Microsoft MI-Based Debugger 1.0 Provides support for connecting Visual Studio to MI compatible debuggers

Microsoft Visual C++ Wizards 1.0 Microsoft Visual C++ Wizards

Microsoft Visual Studio VC Package 1.0 Microsoft Visual Studio VC Package

Mono Debugging for Visual Studio 4.9.11-pre (71eb098) Support for debugging Mono processes with Visual Studio.

NuGet Package Manager 4.6.0 NuGet Package Manager in Visual Studio. For more information about NuGet, visit http://docs.nuget.org/.

ProjectServicesPackage Extension 1.0 ProjectServicesPackage Visual Studio Extension Detailed Info

Visual Basic Tools 2.7.0-beta3-62715-05. Commit Hash: db02128e6e3c4bdfc93e6ec425ac9162b4d4fe80 Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual C++ for Cross Platform Mobile Development (Android) 15.0.27428.2015 Visual C++ for Cross Platform Mobile Development (Android)

Visual C++ for Linux Development 1.0.8 Visual C++ for Linux Development

Visual Studio Code Debug Adapter Host Package 1.0 Interop layer for hosting Visual Studio Code debug adapters in Visual Studio

Visual Studio Tools for CMake 1.0 Visual Studio Tools for CMake

VisualStudio.Mac 1.0 Mac Extension for Visual Studio

Xamarin 4.9.0.753 (f0f46392f) Visual Studio extension to enable development for Xamarin.iOS and Xamarin.Android.

Xamarin Designer 4.10.96 (a702a9ceb) Visual Studio extension to enable Xamarin Designer tools in Visual Studio.

Xamarin.Android SDK 8.2.0.16 (HEAD/a78295902) Xamarin.Android Reference Assemblies and MSBuild support.

Xamarin.iOS and Xamarin.Mac SDK 11.9.1.24 (f62de47) Xamarin.iOS and Xamarin.Mac Reference Assemblies and MSBuild support.

Log File

log 1GB device, 120MB alloc.txt

In this log each allocation at the beginning increases the GetTotMem variable, but the free memory does not change.

04-13 17:38:06.892: I/GetProcessMemoryInfo(9371): Size:120 GetTotMem:4.260584 avail:534.0979 total:925.1758 04-13 17:38:08.902: I/GetProcessMemoryInfo(9371): Size:120 GetTotMem:124.3471 avail:517.4641 total:925.1758 04-13 17:38:10.902: I/GetProcessMemoryInfo(9371): Size:120 GetTotMem:244.3472 avail:517.5911 total:925.1758 04-13 17:38:12.972: I/GetProcessMemoryInfo(9371): Size:120 GetTotMem:364.4317 avail:517.4231 total:925.1758 04-13 17:38:15.062: I/GetProcessMemoryInfo(9371): Size:120 GetTotMem:484.4317 avail:517.4231 total:925.1758

At the end of the log the last mem tag shows GetTotMem:1324.432 avail:517.0709 total:925.1758 so mono thinks there is 1324.432 MB in use, but the total physical memory is 925.1758 and the available memory is still reported as 517 which is the same as before the first allocation.

the last GC event shows that mono thinks the LOST used size is 1054695K which is more than the physical memory size. 04-13 17:41:20.282: D/Mono(9371): GC_MAJOR: (LOS overflow) time 13.59ms, stw 16.49ms los size: 1055716K in use: 1054695K

jonpryor commented 6 years ago

I can't directly repro what you're observing; I can partially repro it. With this code:

        void AllocGarbage()
        {
            var activityManager = (ActivityManager)this.GetSystemService(Android.Content.Context.ActivityService);
            var tmp = new ActivityManager.MemoryInfo();
            const int size = 120;
            while (true)
            {
                activityManager.GetMemoryInfo(tmp);
                Android.Util.Log.Info("GetProcessMemoryInfo", "Size:" + size
                    + " GetTotMem:" + System.GC.GetTotalMemory(false) / 1000000f
                    + " avail:" + tmp.AvailMem / 1000000f + " total:" + tmp.TotalMem / 1000000f);

                var bff = new byte[size * 1000000];

                System.Threading.Thread.Sleep(2000);
            }
        }

I get the following output when running on a Pixel 2:

06-08 13:34:01.540 I/GetProcessMemoryInfo( 6122): Size:120 GetTotMem:244.3891 avail:2489.68 total:3821.285
06-08 13:34:03.556 I/GetProcessMemoryInfo( 6122): Size:120 GetTotMem:244.3891 avail:2489.553 total:3821.285
06-08 13:34:05.572 I/GetProcessMemoryInfo( 6122): Size:120 GetTotMem:244.3891 avail:2490.188 total:3821.285
...

I hit steady-state at ~244MB, which is somewhat odd/interesting in that your log file appears to reach steady-state of around 800MB.

Your log file aside, an important question is: Why is GC.GetTotalMemory() returning ~244MB on my run? size is 120, so on every loop iteration we allocate a 120MB of data, so why isn't GC.GetTotalMemory() returning 120MB, or even 0MB?

The answer is that our GC conservatively scans the stack, unlike desktop .NET. Specifically, bff is not dead immediately after the allocation, and thus is not eligable for collection, until the next loop iteration, when bff is overwritten with the next array allocation.

What this means is that, for loop iterations such as that above, you can expect at least 2 arrays to exist at a given time before any collection is performed.

Further note that GC.Collect() will not help here; because the stack is scanned conservatively, there will always be an array reference which the GC will keep alive, prolonging its lifetime, and keeping your total memory use higher.

Additionally, it may be beneficial to "null out" the array variable:

                var bff = new byte[size * 1000000];
                bff = null;

This often won't do any good, but when I did that it reduced my memory use by half, to ~120MB.


In general, it may not a preferred development practice to allocate, use, and "drop"/stop using an array. Instead, if at all possible, arrays should be reused as much as possible. This will reduce allocations, thus reducing total memory use.

RowanHarris commented 6 years ago

Hi Jon,

The device you are using has 4GB, in my tests devices with large amounts of RAM had to significantly increase the allocation size.

As I mentioned above I tested on a device with 4GB and the problem happened when I set the size variable to 700.

Could you try using 700, and if that doesn't work use 800 or 900 etc. It will happen at some point just increase until it does.

I realize it is a ridiculously large size, however this is the simplest repeatable code I could create that shows the bug and gives similar strange behavior that we were seeing with our real application with thousands of much smaller allocations.

Thanks

Rowan

RowanHarris commented 6 years ago

Hi @jonpryor were you able to repeat the bug using the above instructions?

brendanzagaeski commented 6 years ago

Cross-referencing note to the Xamarin.Android team

This sounds quite similar to the old question from https://bugzilla.xamarin.com/show_bug.cgi?id=28958 about a technical limitation with non-precise stack scanning and false pinned objects. If this is indeed the same behavior as what was happening here, then manually managing the memory would be one way to avoid the behavior.

(Edit: That old issue does not discuss symptoms where the reported "los size" exceeds the available memory of the device, but perhaps that symptom is closely related?)

RowanHarris commented 6 years ago

Hi @brendanzagaeski, I think you are on to something!

The sample app I uploaded needed very large memory allocations to cause the problem. However in our real application it would happen with 5MB allocations. However those allocations were for buffers of memory that would be used by OpenGL, so I would allocate pinned memory which was different to what I did in the sample application.

In our real app it would need to resize this pinned buffer by first freeing the old buffer and allocating a new pinned buffer. However that freed memory sometimes would remain in the LOS. So it sounds like the same issue of false pinned memory, however it happens at lower memory allocation size when it was actually declared to be pinned.

So it could be something is causing the pinned memory to remain false pinned even when I call Free()

I will investigate if I can find the trigger that causes this bug to happen. I will also try the work around for the bug of manually managing the memory since.

Thanks for remembering the old case and giving me a new lead.

Was there any plan to fix the old bug? "The only way to properly fix this would be to implement precise stack scanning."

RowanHarris commented 6 years ago

the manually pinned memory or not did not seem to influence the problem from happening, it was just the same with pinned memory as it was without.

So I had to bite the bullet and remove all allocations larger than 5MB from xamarin managed memory and do my own memory management.

At least this problem is now resolved for me that the LOS is no longer false pinning memory and causing the application to crash. But only because I basically dont use the LOS since it can't be trusted to release the memory.

But really it needs to be resolved not just worked around. At the moment it is a land mine for many future people who can waste weeks of their time like I did trying to find why the memory is not being released.