WPF control is not reclaimed by the GC on .NET 9

minaew commented 4 months ago

Description

The GC does not reclaim a control's memory in the following scenario: the control is placed in the visual tree and then removed from it. Both of these conditions must be met:

PresentationTraceSources.Refresh() is called
there is a subscriber to the control's Unloaded event

Reproduction Steps

Run the TestLeak test in a project targeting net9.0-windows. The NUnit version is 3.12.0.

using System;
using System.Diagnostics;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Threading;
using NUnit.Framework;

namespace Sample {
    [TestFixture]
    public class MemoryTests {
        [Test]
        public void TestLeak() {
            PresentationTraceSources.Refresh();
            var window = new Window();
            var wr = AddAndRemoveImage(window);
            GC.Collect();
            GC.WaitForPendingFinalizers();
            Assert.IsFalse(wr.IsAlive);
        }

        static WeakReference AddAndRemoveImage(Window window) {
            var img = new Image();
            var wr = new WeakReference(img);
            var holder = new ImageHolder(img);
            window.Content = img;
            window.Show();
            DoEvents();
            window.Content = null;
            DoEvents();
            return wr;
        }

        static void DoEvents() {
            var frame = new DispatcherFrame();
            Dispatcher.CurrentDispatcher.InvokeAsync(() => frame.Continue = false, DispatcherPriority.Background);
            Dispatcher.PushFrame(frame);
        }

        class ImageHolder {
            readonly Image image;
            public ImageHolder(Image image) {
                this.image = image;
                this.image.Loaded += OnLoaded;
                this.image.Unloaded += OnUnloaded;
            }

            void OnLoaded(object sender, RoutedEventArgs e) {
                Debug.WriteLine("OnLoaded");
            }

            void OnUnloaded(object sender, RoutedEventArgs e) {
                Debug.WriteLine("OnUnloaded");
            }
        }
    }
}

Expected behavior

The test passes.

Actual behavior

The test fails.

Regression?

This works in net8.0 and net462.

Known Workarounds

No response

Configuration

.NET: 9.0.0-preview.6.24327.6
OS: Windows 10, Version 22H2 (OS Build 19045.4651)
Architecture: x64

Other information

No response

teo-tsirpanis commented 4 months ago

Can you add [MethodImpl(MethodImplOptions.NoInlining)] in AddAndRemoveImage and try again? There's a chance the JIT has inlined AddAndRemoveImage, which might have extended the lifetime of img.
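For illustration, a minimal sketch of what that attribute looks like on a helper that creates an object and hands back only a weak reference. The class name LifetimeSketch and the method CreateAndDrop are hypothetical stand-ins for AddAndRemoveImage, and a plain object stands in for the Image, so this shows only the shape of the change and is not the actual repro.

```csharp
using System;
using System.Runtime.CompilerServices;

static class LifetimeSketch {
    // NoInlining asks the JIT to keep this method as a separate frame, so the
    // local 'target' cannot end up as a local of the calling test method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateAndDrop() {
        var target = new object(); // hypothetical stand-in for the Image in the repro
        return new WeakReference(target);
    }
}
```

In the repro, the same attribute would go directly above the AddAndRemoveImage declaration, with a using directive for System.Runtime.CompilerServices added at the top of the file.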

EgorBo commented 4 months ago

> Can you add [MethodImpl(MethodImplOptions.NoInlining)] in AddAndRemoveImage and try again? There's a chance the JIT has inlined AddAndRemoveImage, which might have extended the lifetime of img.

If it's inlined, then it means it's Tier1/Fullopts, hence precise liveness is expected, so the object should also be reclaimed. I quickly tried the repro locally, and there is indeed a change in behaviour between net8.0-windows and net9.0-windows for both Tier0 and Tier1, so it shouldn't be the JIT's fault.

minaew commented 4 months ago

> Can you add [MethodImpl(MethodImplOptions.NoInlining)] in AddAndRemoveImage and try again? There's a chance the JIT has inlined AddAndRemoveImage, which might have extended the lifetime of img.

Thanks. I tried, still no success.

EgorBo commented 4 months ago

Might be the same issue as https://github.com/dotnet/runtime/issues/104218 cc @VSadov

minaew commented 4 months ago

The RetryAttribute helps. Fewer retries are required if the test closes the window.

teo-tsirpanis commented 4 months ago

I would also suggest running GC and waiting for pending finalizers in your tests more than once. The object might not be released on the first try.
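A rough sketch of that suggestion, assuming a small helper that repeats the collect-and-wait cycle until the weak reference is dead. The names GcSketch and CollectUntilReleased and the default of 3 attempts are hypothetical and not taken from the repro.

```csharp
using System;

static class GcSketch {
    // Repeats a full collection until the target is gone or the attempts run out.
    // Returns true if the weakly referenced object was released.
    public static bool CollectUntilReleased(WeakReference wr, int maxAttempts = 3) {
        for (int i = 0; i < maxAttempts && wr.IsAlive; i++) {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect(); // reclaim objects whose finalizers ran in the previous step
        }
        return !wr.IsAlive;
    }
}
```

In TestLeak, the single GC.Collect() / GC.WaitForPendingFinalizers() pair followed by the assertion could then be written as Assert.IsTrue(GcSketch.CollectUntilReleased(wr)).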

minaew commented 3 months ago

> I would also suggest running GC and waiting for pending finalizers in your tests more than once. The object might not be released on the first try.

This does not help either; it even increases the number of required retries.

minaew commented 3 months ago

By the way, if I hide the window (window.Hide()), the test becomes much more stable, but still not 100%.

mangod9 commented 3 months ago

> > Can you add [MethodImpl(MethodImplOptions.NoInlining)] in AddAndRemoveImage and try again? There's a chance the JIT has inlined AddAndRemoveImage, which might have extended the lifetime of img.
>
> If it's inlined, then it means it's Tier1/Fullopts, hence precise liveness is expected, so the object should also be reclaimed. I quickly tried the repro locally, and there is indeed a change in behaviour between net8.0-windows and net9.0-windows for both Tier0 and Tier1, so it shouldn't be the JIT's fault.

@EgorBo, can you please clarify what you mean by behavior change here?

EgorBo commented 3 months ago

> > > Can you add [MethodImpl(MethodImplOptions.NoInlining)] in AddAndRemoveImage and try again? There's a chance the JIT has inlined AddAndRemoveImage, which might have extended the lifetime of img.
> >
> > If it's inlined, then it means it's Tier1/Fullopts, hence precise liveness is expected, so the object should also be reclaimed. I quickly tried the repro locally, and there is indeed a change in behaviour between net8.0-windows and net9.0-windows for both Tier0 and Tier1, so it shouldn't be the JIT's fault.
>
> @EgorBo, can you please clarify what you mean by behavior change here?

            GC.Collect();
            GC.WaitForPendingFinalizers();

no longer call the finalizer right away in .NET 9.0, while they consistently call it in .NET 8.0, both for optimized and unoptimized codegen. I know it's not supposed to be 100% guaranteed, but that is what the author is complaining about.

mangod9 commented 3 months ago

Ok thanks. Adding @jkotas @VSadov as well, possibly related to moving finalizer loop to managed?

jkotas commented 3 months ago

> possibly related to moving finalizer loop to managed?

It is very unlikely to be related to that change.

I am not able to reproduce the issue. Could you please create a scratch project on github that can be cloned and run to reproduce it?

minaew commented 3 months ago

> > possibly related to moving finalizer loop to managed?
>
> It is very unlikely to be related to that change.
>
> I am not able to reproduce the issue. Could you please create a scratch project on github that can be cloned and run to reproduce it?

net9-gc-issues.zip

minaew commented 3 months ago

> > possibly related to moving finalizer loop to managed?
>
> It is very unlikely to be related to that change.
>
> I am not able to reproduce the issue. Could you please create a scratch project on github that can be cloned and run to reproduce it?

https://github.com/minaew/net9-gc-issues