Closed major-sanyi closed 1 month ago
Quick note: it seems to have already been an issue in the Ryochan7 fork. Will investigate further tomorrow.
Upon investigation with dotMemory it seems like GC runs periodically fine just the unmanaged memory does not decrease in my machine. Probably because I have 32GB, and half of it is unused. Will check what could be causing but the issue is not that severe that I first thought it.
It's been a while but I think the problem still exists; basically ReadWithFileStream
is used for reading input reports (which is invoked many times per second). In past profiler runs I noticed that internally this method allocates a stream wrapper each read that puts pressure on the GC. This can be fixed by just using APIs closer to the ReadFile
WinAPIs that directly write to a byte buffer, bypassing allocating stream objects altogether.
This is just one of many memory leaks we saw like ~2 or so years ago.
@nefarius could you please have a look if this rewritten method makes sense and there are no major mistakes in how the P/Invoke calls and OVERLAPPED
are used? (I've barely got any experience with native interop so I think it's better to ask someone that's been through this pain countless times :P)
https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L236
Using it seems to have basically gotten rid of that memory leak when profile editor is being opened, there's only a small amount allocated now every time it's opened and it's freed after some time:
The only thing that concerns me are those spikes in overall memory usage when GC runs, no idea where that would come from?
Sure, I'll take a look.
Well seems like it works fine except:
EDIT: the 2nd issue seems to be related to that CloseHandle
call in the finally
block
Need to figure out how I can upload the changes, GitHub Desktop is confused, I have too many DS4Windows forks it appears 😬
There we go #28 @schmaldeo pls test these changes before accepting, on this machine I currently do not have the prerequisites installed so consider them untested for now. I see a couple other things to address but that is not as important.
Spamming the profile editor makes it leak a lot of memory anyway (but opening it slowly multiple times actually works nicely)
This has been an issue since forever IIRC; spawning individual windows create and destroy so many objects it puts a lot of pressure in the GC on disposal.
Also throughout the entire app techniques that are outdated for the longest time are used that may add to it (pinning objects, forcing GC runs manually etc.).
There we go #28 @schmaldeo pls test these changes before accepting, on this machine I currently do not have the prerequisites installed so consider them untested for now. I see a couple other things to address but that is not as important.
It throws an AccessViolationException (similarly to what i was getting when i tried using NativeOverlapped
directly, therefore in my version i switched to using the managed Overlapped
with .Pack()
) after the PInvoke.ReadFile()
call.
Guess I'm not getting around getting it to run then so I can debug. Must be something silly.
It seems to be something about NativeOverlapped
in general, when i instantiate the managed Overlapped
with the same params and pack it like this:
var overlapped = new Overlapped(0, 0, wait.SafeWaitHandle.DangerousGetHandle(), null);
var packed = overlapped.Pack(null, null);
if (PInvoke.ReadFile(SafeReadHandle, inputBuffer, null, packed))
// NOTE: this will never hit when overlapped is used
return ReadStatus.Success;
// NOTE: if a timeout is required, use GetOverlappedResultEx instead
if (!PInvoke.GetOverlappedResult(SafeReadHandle, *packed, out var transferred, true))
return ReadStatus.ReadError;
it works just fine (but yeh idk what implications that really has, using the Native one might be better for all i know).
It's not supposed to fault so changing this only masquerades some issue we have not yet uncovered.
I compared what CsWin32 generated here with a different project that uses the same API successfully and there are significant differences, investigating, maybe the NuGet is too old.
Yeah something funky is going on with the source generation, the native definitions differ vastly:
Ooooh, dang, I get it now. The code I used as a template is from an outdated assembly using a 0.2.x version of CsWin32
so I assume the behavior and implementation practices have changed. Gonna figure this one out.
So the way the API works now is that it expects a NativeMemory
allocated NativeOverlapped
, I don't yet fully understand why that is necessary, however I will port over your code for now so we get unblocked on this.
So the way the API works now is that it expects a
NativeMemory
allocatedNativeOverlapped
, I don't yet fully understand why that is necessary, however I will port over your code for now so we get unblocked on this.
Seems like that's exactly what's happening when you call .Pack()
I've got a question though, in Remarks for this method in docs, it says:
The NativeOverlapped structure is fixed in physical memory until Unpack is called.
Will it not be necessary to unpack it every time before the method returns?
Will it not be necessary to unpack it every time before the method returns?
Correct, otherwise we just leave a copy in memory dangling around that will never get freed.
Hence my confusion on why this seemingly simple call suddenly got so overly complicated; in older CsWin32
and custom P/Invoke code you could simply pass the address to the stackheap-allocated struct to it and be done with it.
See this reference implementation that worked just fine ~1-2 years ago:
public virtual unsafe int ReadInputReport(Span<byte> buffer)
{
if (Handle.IsInvalid || Handle.IsClosed)
{
throw new HidDeviceException("Device handle not open or invalid.");
}
NativeOverlapped overlapped = new() { EventHandle = _readEvent.SafeWaitHandle.DangerousGetHandle() };
uint bytesRead = 0;
fixed (byte* bufferPtr = buffer)
{
BOOL ret = PInvoke.ReadFile(
Handle,
bufferPtr,
(uint)buffer.Length,
&bytesRead,
&overlapped
);
if (!ret && Marshal.GetLastWin32Error() != (uint)WIN32_ERROR.ERROR_IO_PENDING)
{
throw new HidDeviceException("Unexpected return result on ReadFile.");
}
if (!PInvoke.GetOverlappedResult(Handle, overlapped, out bytesRead, true))
{
throw new HidDeviceException("GetOverlappedResult on input report failed.");
}
}
return (int)bytesRead;
}
Looking at the call stack it appears that the native overlapped is accessed from a different thread; which is ofc. invalid and bombs since the addresses will not match:
Genuinely do not understand what's going on here, in C doing this is so easy 😅
I think I get what's going on here; that stack trace is definitely from a different thread and code path. I think the issue here is that we still mix the PInvoke and the stream usage on the same file handle. I can try to modernize the rest of that class another time, I am fairly confident that this unearthed a bug in parallel-running code.
Yep, refactoring WriteOutputReportViaInterrupt()
like so: https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L406
and not initialising the FileStream
in any DS4 codepath seems to have fixed the issue of not being able to directly use NativeOverlapped
.
Sadly I haven't got any other controllers so if I refactor the rest of Device classes to use P/Invoke for I/O I can't test if they work seamlessly.
Sadly I haven't got any other controllers so if I refactor the rest of Device classes to use P/Invoke for I/O I can't test if they work seamlessly.
That's what the users profiting from your free labour are for, let them spend 3 minutes of their day time to test and report 😉
Yep, refactoring
WriteOutputReportViaInterrupt()
like so: https://github.com/schmaldeo/DS4Windows/blob/input-report-memory-leak/DS4Windows/HidLibrary/HidDevice.cs#L406 and not initialising theFileStream
in any DS4 codepath seems to have fixed the issue of not being able to directly useNativeOverlapped
.
Nice, I knew I wasn't completely crazy 😅 ideally in the end the entire FileStream object is no longer in use and can be tossed entirely.
public unsafe ReadStatus ReadFile(Span<byte> inputBuffer, uint timeout = 0)
{
SafeReadHandle ??= OpenHandle(_devicePath, true, false);
using AutoResetEvent wait = new(false);
var ov = new NativeOverlapped { EventHandle = wait.SafeWaitHandle.DangerousGetHandle() };
// NOTE: this will never hit when overlapped is used
if (PInvoke.ReadFile(SafeReadHandle, inputBuffer, null, &ov))
return ReadStatus.Success;
uint transferred;
if (timeout != 0)
{
if (!PInvoke.GetOverlappedResultEx(SafeReadHandle, ov, out transferred, timeout, true))
return ReadStatus.ReadError;
}
else
{
if (!PInvoke.GetOverlappedResult(SafeReadHandle, ov, out transferred, true))
return ReadStatus.ReadError;
}
// this should never happen
if (transferred > inputBuffer.Length)
throw new InvalidOperationException("We read more than the buffer can hold.");
return ReadStatus.Success;
}
Does this look good or is there a way to only use GetOverlappedResultEx
?
You can soley use GetOverlappedResultEx
by setting the timeout parameter to INFINITE
to have it only return when the call has finished.
Also the remark // NOTE: this will never hit when overlapped is used
I made isn't quite true and these methods can be further improved; if PInvoke.ReadFile
returns 0/FALSE and Marshal.GetLastWin32Error
is ERROR_IO_PENDING
only then you actually need to call the wait functions, like GetOverlappedResultEx
.
If ReadFile
immediately returns with success despite overlapped I/O being used, there is no need to wait on it since there is no pending I/0.
A lot of these pitfalls are "hidden" in the Remakrs sections of these Win32 functions documentation.
I read about setting the timeout, but I don't understand how to do it when the type of that parameter is uint?
I read about setting the timeout, but I don't understand how to do it when the type of that parameter is uint?
By giving it
#define INFINITE 0xFFFFFFFF // Infinite timeout
Which in C# should also exist pre-defined; UInt32.MaxValue
IIRC? Or let CsWin32
generate the macro definition.
Yes uint.MaxValue worked, when i tested it yesterday it would never remove the controller once it was disconnected but it was probably some stupid error on my side 😀
Describe the bug It seems like opening the profile editor uses some extra memory, that's not freed upon closing the profile window. Stumbled on by accident.
To Reproduce Steps to reproduce the behavior:
Expected behavior Freeing up some memory overtime, not allocating it indefinetly.
Screenshots and Logs
Desktop (please complete the following information):
Additional context Add any other context about the problem here.