schmaldeo / DS4Windows

DS4Windows but maintained
GNU General Public License v3.0
113 stars 8 forks source link

[BUG] Possible memory leak #15

Closed major-sanyi closed 1 month ago

major-sanyi commented 1 month ago

Describe the bug It seems like opening the profile editor uses some extra memory, that's not freed upon closing the profile window. Stumbled on by accident.

To Reproduce Steps to reproduce the behavior:

  1. Open profile editor on any profile. (I used default)
  2. Close it
  3. Open it again
  4. Repeat

Expected behavior Freeing up some memory overtime, not allocating it indefinetly.

Screenshots and Logs image

Desktop (please complete the following information):

Additional context Add any other context about the problem here.

schmaldeo commented 1 month ago

Quick note: it seems to have already been an issue in the Ryochan7 fork. Will investigate further tomorrow.

major-sanyi commented 1 month ago

image

Upon investigation with dotMemory it seems like GC runs periodically fine just the unmanaged memory does not decrease in my machine. Probably because I have 32GB, and half of it is unused. Will check what could be causing but the issue is not that severe that I first thought it.

nefarius commented 1 month ago

It's been a while but I think the problem still exists; basically ReadWithFileStream is used for reading input reports (which is invoked many times per second). In past profiler runs I noticed that internally this method allocates a stream wrapper each read that puts pressure on the GC. This can be fixed by just using APIs closer to the ReadFile WinAPIs that directly write to a byte buffer, bypassing allocating stream objects altogether.

This is just one of many memory leaks we saw like ~2 or so years ago.

schmaldeo commented 1 month ago

@nefarius could you please have a look if this rewritten method makes sense and there are no major mistakes in how the P/Invoke calls and OVERLAPPED are used? (I've barely got any experience with native interop so I think it's better to ask someone that's been through this pain countless times :P) https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L236

Using it seems to have basically gotten rid of that memory leak when profile editor is being opened, there's only a small amount allocated now every time it's opened and it's freed after some time:

The only thing that concerns me are those spikes in overall memory usage when GC runs, no idea where that would come from?

nefarius commented 1 month ago

Sure, I'll take a look.

schmaldeo commented 1 month ago

Well seems like it works fine except:

EDIT: the 2nd issue seems to be related to that CloseHandle call in the finally block

nefarius commented 1 month ago

Need to figure out how I can upload the changes, GitHub Desktop is confused, I have too many DS4Windows forks it appears 😬

nefarius commented 1 month ago

There we go #28 @schmaldeo pls test these changes before accepting, on this machine I currently do not have the prerequisites installed so consider them untested for now. I see a couple other things to address but that is not as important.

nefarius commented 1 month ago

Spamming the profile editor makes it leak a lot of memory anyway (but opening it slowly multiple times actually works nicely)

This has been an issue since forever IIRC; spawning individual windows create and destroy so many objects it puts a lot of pressure in the GC on disposal.

Also throughout the entire app techniques that are outdated for the longest time are used that may add to it (pinning objects, forcing GC runs manually etc.).

schmaldeo commented 1 month ago

There we go #28 @schmaldeo pls test these changes before accepting, on this machine I currently do not have the prerequisites installed so consider them untested for now. I see a couple other things to address but that is not as important.

It throws an AccessViolationException (similarly to what i was getting when i tried using NativeOverlapped directly, therefore in my version i switched to using the managed Overlapped with .Pack()) after the PInvoke.ReadFile() call.

nefarius commented 1 month ago

Guess I'm not getting around getting it to run then so I can debug. Must be something silly.

schmaldeo commented 1 month ago

It seems to be something about NativeOverlapped in general, when i instantiate the managed Overlapped with the same params and pack it like this:

var overlapped = new Overlapped(0, 0, wait.SafeWaitHandle.DangerousGetHandle(), null);
var packed = overlapped.Pack(null, null);
if (PInvoke.ReadFile(SafeReadHandle, inputBuffer, null, packed))
    // NOTE: this will never hit when overlapped is used
    return ReadStatus.Success;

// NOTE: if a timeout is required, use GetOverlappedResultEx instead
if (!PInvoke.GetOverlappedResult(SafeReadHandle, *packed, out var transferred, true))
    return ReadStatus.ReadError;

it works just fine (but yeh idk what implications that really has, using the Native one might be better for all i know).

nefarius commented 1 month ago

It's not supposed to fault so changing this only masquerades some issue we have not yet uncovered.

nefarius commented 1 month ago

I compared what CsWin32 generated here with a different project that uses the same API successfully and there are significant differences, investigating, maybe the NuGet is too old.

nefarius commented 1 month ago

Yeah something funky is going on with the source generation, the native definitions differ vastly:

Works

image

Bombs

image

nefarius commented 1 month ago

Ooooh, dang, I get it now. The code I used as a template is from an outdated assembly using a 0.2.x version of CsWin32 so I assume the behavior and implementation practices have changed. Gonna figure this one out.

nefarius commented 1 month ago

So the way the API works now is that it expects a NativeMemory allocated NativeOverlapped, I don't yet fully understand why that is necessary, however I will port over your code for now so we get unblocked on this.

schmaldeo commented 1 month ago

So the way the API works now is that it expects a NativeMemory allocated NativeOverlapped, I don't yet fully understand why that is necessary, however I will port over your code for now so we get unblocked on this.

Seems like that's exactly what's happening when you call .Pack()

I've got a question though, in Remarks for this method in docs, it says:

The NativeOverlapped structure is fixed in physical memory until Unpack is called.

Will it not be necessary to unpack it every time before the method returns?

nefarius commented 1 month ago

Will it not be necessary to unpack it every time before the method returns?

Correct, otherwise we just leave a copy in memory dangling around that will never get freed.

Hence my confusion on why this seemingly simple call suddenly got so overly complicated; in older CsWin32 and custom P/Invoke code you could simply pass the address to the stackheap-allocated struct to it and be done with it.

See this reference implementation that worked just fine ~1-2 years ago:

    public virtual unsafe int ReadInputReport(Span<byte> buffer)
    {
        if (Handle.IsInvalid || Handle.IsClosed)
        {
            throw new HidDeviceException("Device handle not open or invalid.");
        }

        NativeOverlapped overlapped = new() { EventHandle = _readEvent.SafeWaitHandle.DangerousGetHandle() };

        uint bytesRead = 0;
        fixed (byte* bufferPtr = buffer)
        {
            BOOL ret = PInvoke.ReadFile(
                Handle,
                bufferPtr,
                (uint)buffer.Length,
                &bytesRead,
                &overlapped
            );

            if (!ret && Marshal.GetLastWin32Error() != (uint)WIN32_ERROR.ERROR_IO_PENDING)
            {
                throw new HidDeviceException("Unexpected return result on ReadFile.");
            }

            if (!PInvoke.GetOverlappedResult(Handle, overlapped, out bytesRead, true))
            {
                throw new HidDeviceException("GetOverlappedResult on input report failed.");
            }
        }

        return (int)bytesRead;
    }
nefarius commented 1 month ago

Looking at the call stack it appears that the native overlapped is accessed from a different thread; which is ofc. invalid and bombs since the addresses will not match:

image

Genuinely do not understand what's going on here, in C doing this is so easy 😅

nefarius commented 1 month ago

I think I get what's going on here; that stack trace is definitely from a different thread and code path. I think the issue here is that we still mix the PInvoke and the stream usage on the same file handle. I can try to modernize the rest of that class another time, I am fairly confident that this unearthed a bug in parallel-running code.

schmaldeo commented 1 month ago

Yep, refactoring WriteOutputReportViaInterrupt() like so: https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L406 and not initialising the FileStream in any DS4 codepath seems to have fixed the issue of not being able to directly use NativeOverlapped.

Sadly I haven't got any other controllers so if I refactor the rest of Device classes to use P/Invoke for I/O I can't test if they work seamlessly.

nefarius commented 1 month ago

Sadly I haven't got any other controllers so if I refactor the rest of Device classes to use P/Invoke for I/O I can't test if they work seamlessly.

That's what the users profiting from your free labour are for, let them spend 3 minutes of their day time to test and report 😉

nefarius commented 1 month ago

Yep, refactoring WriteOutputReportViaInterrupt() like so: https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L406 and not initialising the FileStream in any DS4 codepath seems to have fixed the issue of not being able to directly use NativeOverlapped.

Nice, I knew I wasn't completely crazy 😅 ideally in the end the entire FileStream object is no longer in use and can be tossed entirely.

schmaldeo commented 1 month ago
public unsafe ReadStatus ReadFile(Span<byte> inputBuffer, uint timeout = 0)
{
    SafeReadHandle ??= OpenHandle(_devicePath, true, false);

    using AutoResetEvent wait = new(false);

    var ov = new NativeOverlapped { EventHandle = wait.SafeWaitHandle.DangerousGetHandle() };

    // NOTE: this will never hit when overlapped is used
    if (PInvoke.ReadFile(SafeReadHandle, inputBuffer, null, &ov))
        return ReadStatus.Success;

    uint transferred;
    if (timeout != 0)
    {
        if (!PInvoke.GetOverlappedResultEx(SafeReadHandle, ov, out transferred, timeout, true))
            return ReadStatus.ReadError;
    }
    else
    {
        if (!PInvoke.GetOverlappedResult(SafeReadHandle, ov, out transferred, true))
            return ReadStatus.ReadError;
    }

    // this should never happen
    if (transferred > inputBuffer.Length)
        throw new InvalidOperationException("We read more than the buffer can hold.");

    return ReadStatus.Success;
}

Does this look good or is there a way to only use GetOverlappedResultEx?

nefarius commented 1 month ago

You can soley use GetOverlappedResultEx by setting the timeout parameter to INFINITE to have it only return when the call has finished.

Also the remark // NOTE: this will never hit when overlapped is used I made isn't quite true and these methods can be further improved; if PInvoke.ReadFile returns 0/FALSE and Marshal.GetLastWin32Error is ERROR_IO_PENDING only then you actually need to call the wait functions, like GetOverlappedResultEx.

If ReadFile immediately returns with success despite overlapped I/O being used, there is no need to wait on it since there is no pending I/0.

A lot of these pitfalls are "hidden" in the Remakrs sections of these Win32 functions documentation.

schmaldeo commented 1 month ago

I read about setting the timeout, but I don't understand how to do it when the type of that parameter is uint?

nefarius commented 1 month ago

I read about setting the timeout, but I don't understand how to do it when the type of that parameter is uint?

By giving it

#define INFINITE            0xFFFFFFFF  // Infinite timeout

Which in C# should also exist pre-defined; UInt32.MaxValue IIRC? Or let CsWin32 generate the macro definition.

schmaldeo commented 1 month ago

Yes uint.MaxValue worked, when i tested it yesterday it would never remove the controller once it was disconnected but it was probably some stupid error on my side 😀

schmaldeo commented 1 month ago

In beta