nefarius / ViGEmBus

Windows kernel-mode driver emulating well-known USB game controllers.
https://docs.nefarius.at/projects/ViGEm/
BSD 3-Clause "New" or "Revised" License
3.08k stars 285 forks source link

Vibration persists after being set to 0 #68

Closed Filoppi closed 1 year ago

Filoppi commented 3 years ago

Describe the bug Two subsequent calls to XInputSetState with different values break the state of the motors (vibration persists after being reset to 0 on both motors). It can easily be tested with this configuration on Dolphin: Bug This happens on a DS4 with DS4Windows, though looking at DS4Windows, no XInput code seems to be there so I thought it was probably ViGEmBus? Sorry if I'm wrong. The bug does not happen on a real Xbox controller. I've also wrote some code that triggers the bug and pasted the pseudo code for you to reproduce it below. When setting "should_sleep" to true, the state won't break.

To Reproduce Steps to reproduce the behavior (example):

static bool should_sleep = false;
static int sleep_time = 100;
state_out.wLeftMotorSpeed = 0;
state_out.wRightMotorSpeed= 65535;
XInputSetState(i, &state_out););
state_out.wLeftMotorSpeed = 65535;
XInputSetState(i, &state_out););
if (should_sleep)
  Sleep(sleep_time);
state_out.wRightMotorSpeed= 0;
XInputSetState(i, &state_out);
state_out.wLeftMotorSpeed = 0;
XInputSetState(i, &state_out);

Expected behavior Vibration goes back to resting state.

System details (please complete the following information):

Additional context Add any other context about the problem here.

Filoppi commented 3 years ago

Nevermind, I've manually updated ViGEm to 1.17 and it fixed the problem. Keeping the bug open for reference. I thought DS4 would update the driver as well so I didn't realize I was running an old version.

nefarius commented 3 years ago

Yep, that race/timing condition was a huge problem with the previous design of how driver and the client SDK handled these packets, this has been fixed for good in 1.17 😁

mika-n commented 3 years ago

@nefarius Continuing the discussion here about "stuck rumbling motor" issue. Here is a link to the latest finding in DS4Windows where this issue is discussed. Interesting observation about how rumble feedback event handler was called in DS4Windows app by the ViGem .NET/native ViGem DLL library.

I was able to duplicate the "stuck rumble motor" issue with ViGem 1.17.333 kernel driver and the new Nefarious.ViGem.Client.dll SDK DLL library. At this point it is still unknown why this happens and why the "Rumble = 0" event is sometimes lost. Or is stuck in some queue and when the next rumble event is sent by a game/client app then the "the lost" feedback handler msg is flushed and sent away. https://github.com/Ryochan7/DS4Windows/issues/1730#issuecomment-756240299

This test was run using xbox360 ViGem output driver in DS4Windows, Dolphin emulator, Controller configuration screen in Dolphin, Test Rumble button there (using "Motor L | Motor R" rumbling, ie both motors enabled in Dolphin using OR operator).

DS4Windows: 2.2.1 ViGem kernel driver: 1.17.333 Nefarius.ViGEm.Client.dll: (the latest sdk published with 1.17.333 driver version)

Anyway, this was just a "pump up" message here and the investigations continue to find out if the bug is in DS4Windows app itself or somewhere in the ViGem communication line (.NET binding? SDK DLL? Kernel driver?). I hope the bug is in DS4Windows app because then it will be "easy" to fix when the root cause is found. I pray it is not in the new kernel driver level.

Filoppi commented 3 years ago

Note that while the input mapping window is open, depending on your current view dolphin spams updates to "Motor L | Motor R" with a value of 0. For the rest, the bug doesn't happen anymore on my machine.

nefarius commented 3 years ago

@mika-n which binaries for dotnet do you use and from where exactly? Built from sources? Which branch, tag etc.? I haven't merged the changes to dotnet master branch yet so it's crucial to know what and how exactly you test.

mika-n commented 3 years ago

I need to wait for a comment from Ryochan7 because he did those 1.17.x SDK changes, but I believe DS4Windows 2.2.x version uses this branch of ViGem DotNet SDK interface. https://github.com/Ryochan7/ViGEm.NET/tree/ryochan_v1.17-experimental

Now I see it is few commits behind the master Vigem sdk branch. I need to wait until Ryochan7 has had time to experiment with this bug also. Anyway, I was able to trigger the "stuck rumble" bug by doing following:

Sometimes this resulted the "stuck rumble" issue as explained in the linked DS4Windows ticket and debug messages showing that "Large=0 Small=0" message (left/right motor values sent by Dolphin) was sometimes lost (ie. never received until some queue was flushed and the lost message was then received) in DS4Windows feedback event handler.

But, Ryochan7 knows more about this Vigem SDK dotnet library in DS4Windows, so I'm waiting for his comments/tests.

I did a quick test with DS4Windows using the older version of Nefarious.Vigem.client.dll library file and could not trigger the bug with that client version. But, here I'm waiting for comments from users if they could test the same with real games and get back with their test results. But this is promising because this would indicate that the bug is not in ViGEm kernel driver level but in the client dotnet library version used by DS4Windows (and that issue could be solved simply by refreshing the client DLL up-to-date).

ds4win_RumbleTestWithDolphin

nefarius commented 3 years ago

Thanks for the summary, sounds pretty much the same I did the testing with a simple tester app, but the native C/C++ lib. I'll cook up a simple .NET console app with the latest NuGet from AppVeyor and see if it misbehaves or as instructed 😉

nefarius commented 3 years ago

Here's the test with a dead simple console app using v1.17.x of the C library (and v1.17.x driver):

image

Es expected, the last pair of zeros signalling both motors off always arrives last as it is expected.

Will later test the same with .NET.

Cheers

EDIT: this is the simple callback function with a basic logging snippet in it, no locks or anything fancy, the driver takes care of returning outstanding packets to the client SDK in the exact order it arrived, including a cache/queue to eliminate time sensitive/racing issues:

VOID CALLBACK notifyX360(
    PVIGEM_CLIENT Client,
    PVIGEM_TARGET Target,
    UCHAR LargeMotor,
    UCHAR SmallMotor,
    UCHAR LedNumber,
    LPVOID UserData
)
{
    static int count = 1;

    std::cout << "[" << std::setfill('0') << std::setw(8) << count++ << "] "
        << "L: " << std::setfill('0') << std::setw(3) << (int)LargeMotor << " "
        << "S: " << std::setfill('0') << std::setw(3) << (int)SmallMotor << " "
        << "LED: " << (int)LedNumber << std::endl;
}
nefarius commented 3 years ago

Here's the same test working as expected with the latest (master, v1.17.x) .NET library:

image

The code:

using System;
using Nefarius.ViGEm.Client;

namespace ViGem_v17Test
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var counter = 0;

            try
            {
                var client = new ViGEmClient();

                var x360 = client.CreateXbox360Controller();

                x360.Connect();

                x360.FeedbackReceived += (sender, eventArgs) =>
                {
                    Console.WriteLine(
                        $"[{counter++:0000}] - L: {eventArgs.LargeMotor:0000} S: {eventArgs.SmallMotor:0000}");
                };

                Console.ReadKey();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex);
            }
        }
    }
}
nefarius commented 3 years ago

Another "extreme" example with a Thread.Sleep(1000); in the event handler to simulate slow feedback processing:

image

No event is lost, it just arrives later in the sequential order it was put on the (virtual) wire, as the feeder application would expect.

mika-n commented 3 years ago

I suspect that there are some problems in the .NET interface used in DS4Windows app (it has some custom changes). Older version of the .NET vigem library works without feedback events lost, but the new one sometimes leaves events hanging in some buffer. Don't know why it does that, but it seems that the issue is not in ViGem and in the official ViGem SDK library but in the custom one used in DS4Windows app.

nefarius commented 3 years ago

Might be, I provided the tests and results just as a token that I've not gone insane and it actually does what I claim 😅

mika-n commented 3 years ago

@nefarius Just a quick update to this "rumble motors keep on rumbling" issue. Ryochan7 made a test version out of the latest "vanilla" version of ViGen client DLL library. We (me, Ryochan7 and several people using DS4Windows app in xbox360 or dualshock4 output mode) have been running tests using the following ViGem DLL library where the old logic of using ring-queue to make sure there are always X number of pending overlapped IO calls was restored back in place: https://gist.github.com/Ryochan7/f41bd345c5592d88f37ff24773e1e882

Several people have reported that after this change in Vigem client DLL library they haven't experienced those weird rumble issue while using DS4Windows+ViGem+game or emulator app. I don't know what is so special about Dolphin/Cemu/PCSX apps, but it looks like the issue happens there more often than in many other apps. However, some people have experienced the same issue in certain native PC games also.

Here is the link to the discussion thread. https://github.com/Ryochan7/DS4Windows/issues/1730#issuecomment-756240299

This is really really weird thing and difficult to say exactly where the issue is (DS4Windows app? those emulator apps? client dll?), but the linked diff to ViGem client DLL library seem to be at the moment solution to this problem. Tests continue....

nefarius commented 3 years ago

Dumb question, are you sure that you test with the latest 1.17 dotnet lib and native 1.17 lib embedded into it? Not gonna discredit your efforts but it happened quite a few time to myself accidentally running the wrong build.

So in essence what your post is saying: 1.16 receives rumble always in correct order, 1.17 can happen out of order again, yes?

nefarius commented 3 years ago

We could also have a chat on Discord about it.

mika-n commented 3 years ago

Dumb question, are you sure that you test with the latest 1.17 dotnet lib and native 1.17 lib embedded into it?

Hmm.. I believe DS4Windows nowadays uses the latest .NET bindings and the C++ Vigem dll client and 1.17.333 driver because DS4Windows supports virtual touchpad and gyro events and those were not supported in the previous ViGem interface. But I'll dig out the link to the current .NET github branch used by DS4Windows app (need to ask from Ryochan7).

So in essence what your post is saying: 1.16 receives rumble always in correct order, 1.17 can happen out of order again, yes?

While debugging this thing it was found out that the older 1.16 Nefarius.Vigem.Client.dll library didn't suffer from the "persistent rumble" issue (and 1.16 dll library was still using the ring-queue of overlapped IO calls to wait for feedback events), but the new 1.17 client.,dll library suffered from this issue.

Then it was decided that it may be a good idea to test the new Vigem client interface with the overlapped IO logic and Ryochan7 did that diff patch and applied it to the 1.17.x ViGem client library. After testing it few weeks it looks like that the persistent rumble effect is now gone.

The problem with 1.17 was not exactly wrong order. The order of messages was always correct, but it looked like that sometimes messages were delayed or stuck in some waiting queue before the feedback msg was forwarded from the client DLL library (or from Vigem driver) down to the event handler listening these events.

The following post captured this scenario (I had added various debug messages to the feedback event handler). In the log dump at "01/07/2021 18:31:26" the bug finally popped up (test was done using "Test Rumble" button clicks in Dolphin controller configuration screen). The rumble got stuck because "Large=0" message didn't come in until 5 minutes later when the "Test Rumble" button was pressed again in Dolphin. Almost like the event at "01/07/2021 18:35:05: Large=0 Small=0" was hanging in some waiting queue. It was posted in correct order, but the event came in only when the next "Test Rumble" button was pressed and wanted to use a rumble again. https://github.com/Ryochan7/DS4Windows/issues/1730#issuecomment-756240299

A bit difficult to explain, but I hope the log messages and comments in the linked post clarifies this better.

nefarius commented 3 years ago

Thanks, I've read it.

The v1.16 way of thing works also with the 1.17 driver because I took care of being both forward and backwards compatible.

The v.1.16 way is messy and a result of bad design (back in the day I didn't know any better) and should not be used anymore.

The changes of v1.17 driver and SDK summarised: the driver now keeps a copy of the rumble packets in memory in the order it arrived and can get picked up by the SDK with a generous amount of delay (as demonstrated above). Please run my example code yourself, it should work reliably as advertised.

So with v1.17 the amount of ridiculously overengineered code in user-land has been drastically removed and the logic to report back the packets in the right order and without any losses is now the drivers job. To my understanding as of now it is impossible with the current design to have events "lost", "out of order", or "arriving later", you can verify by looking at the code, the .NET wrapper simply calls the native API and fires an event if a packet has been read and continues to do so until the buffer cache is empty.

I can take another look at the new driver logic but in my opinion with only one pair of eyes looking at it I can see no problem there 🤔

nefarius commented 3 years ago

I see if I can simplify the driver more, less code, less grounds for issues 😉

mika-n commented 3 years ago

I will do tests with the sample you provided using Dolphin emulator (testing with the "stock vanilla" version of Vigem client dll library and .NET binding and the tweaked version).

nefarius commented 3 years ago

I will do tests with the sample you provided using Dolphin emulator (testing with the "stock vanilla" version of Vigem client dll library and .NET binding and the tweaked version).

Sweet! Ping me if you have any questions or discoveries ✌️

Ryochan7 commented 3 years ago

This issue has been rectified in the current version of DS4Windows (2.2.7). Although, the fix did require adding a notification queue back to the C++ ViGEmClient code.

The main issue seems to stem from a feedback event being signaled while the client program code is not currently waiting for feedback data via GetOverlappedResult. The prior DeviceIoControl call always seems to return status code 997 (ERROR_IO_PENDING) even if data is technically available in the backend buffer in ViGEmBus. As a result, feedback data is one request behind until another feedback event is sent through. No feedback event data is lost in the process.

When testing the vanilla 1.17 code, the problem was much harder to reproduce than with the 1.16 client. The issue posted on the DS4Windows issue tracker was listed as works for me for a while until I performed my old RetroArch test and was finally able to reproduce the problem. Used the ParaLLEl N64 core and played Legend of Zelda: Ocarina of Time. Bashing Link's head against a fence caused the problem to surface after about 2 minutes. With the notification queue added back in, doing the same thing for 5+ minutes worked fine.

Ryochan7 commented 3 years ago

The notification queue change is still only available as a diff. I probably should get a test branch posted in a fork to make inspecting the change easier.

https://gist.github.com/Ryochan7/f41bd345c5592d88f37ff24773e1e882

Update: Test branch

https://github.com/Ryochan7/ViGEmClient/tree/notification_queue

nefarius commented 3 years ago

The main issue seems to stem from a feedback event being signaled while the client program code is not currently waiting for feedback data via GetOverlappedResult. The prior DeviceIoControl call always seems to return status code 997 (ERROR_IO_PENDING) even if data is technically available in the backend buffer in ViGEmBus. As a result, feedback data is one request behind until another feedback event is sent through. No feedback event data is lost in the process.

Hm, according to MS Docs it is always expected of DeviceIoControl to immediately return with ERROR_IO_PENDING when using the overlapped/async approach. To my understanding it should in no way be possible to have an unexpected state between this call and waiting for completion with the following call to GetOverlappedResult since the event in the overlapped struct ties them together.

To be perfectly honest, I am still not satisfied about the complexity of the notification mechanism, and I'd rather not go back to the user land queueing mechanism because although it seems to work around the issue reliability it's basically just hiding bad design.

I have a private branch with loads of changes in regards to other long standing issues but this branch is definitely the kill switch for anything older than Windows 10 so I held back until recently. It may be beneficial for both our projects to just throw the past over board and use newer frameworks which reduce the bloat within the aged driver.

nefarius commented 3 years ago

The notification queue change is still only available as a diff. I probably should get a test branch posted in a fork to make inspecting the change easier.

https://gist.github.com/Ryochan7/f41bd345c5592d88f37ff24773e1e882

Update: Test branch

https://github.com/Ryochan7/ViGEmClient/tree/notification_queue

Good to know both worlds still can coexist in harmony, means my efforts to maintain backwards compatibility were somewhat fruitful 😅

Next plan for me is also to get rid of C++ features in the client SDK, it was introduced out of convenience and drags dependencies into other projects not needed. Also wanna get rid of the async usage with the notification code, basic WinAPI concepts like completion ports should do the same work with less code and complexity required.

nefarius commented 1 year ago

Closing as much has changed regarding rumble feedback so whoever's affected should check the latest driver and SDK builds.