libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

Switch Pro controller hangs in SDL_InitJoysticks in SDL3 #8843

Open Themaister opened 9 months ago

Themaister commented 9 months ago

To repro:

Connect a Switch Pro Controller on Arch Linux.

git clone https://github.com/Themaister/Granite
cd Granite
git submodule update --init
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -G Ninja
ninja latency-test
./tests/latency-test

Backtrace of the hang:

#0  0x00007ffff7afbf6f in poll () from /usr/lib/libc.so.6
#1  0x00005555561e1043 in PLATFORM_hid_read_timeout (dev=0x555556c84660, data=0x555556c6a30d "0\032Q", length=64, milliseconds=0) at /tmp/Granite/third_party/sdl3/src/hidapi/linux/hid.c:1231
#2  0x00005555561e61f2 in SDL_hid_read_timeout_REAL (device=0x555556c38490, data=0x555556c6a30d "0\032Q", length=64, milliseconds=0) at /tmp/Granite/third_party/sdl3/src/hidapi/SDL_hidapi.c:1590
#3  0x000055555638ff84 in ReadInput (ctx=0x555556c6a2d0) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapi_switch.c:333
#4  0x00005555563900cc in ReadProprietaryReply (ctx=0x555556c6a2d0, expectedID=k_eSwitchProprietaryCommandIDs_Status) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapi_switch.c:380
#5  0x000055555639043d in WriteProprietary (ctx=0x555556c6a2d0, ucCommand=k_eSwitchProprietaryCommandIDs_Status, pBuf=0x0, ucLen=0 '\000', waitForReply=1)
    at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapi_switch.c:479
#6  0x0000555556391d49 in ReadJoyConControllerType (device=0x555556c69d60) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapi_switch.c:984
#7  0x00005555563920bc in HIDAPI_DriverJoyCons_IsSupportedDevice (device=0x555556c69d60, name=0x555556c84630 "Nintendo Pro Controller", type=SDL_GAMEPAD_TYPE_NINTENDO_SWITCH_PRO, vendor_id=1406, 
    product_id=8201, version=528, interface_number=0, interface_class=0, interface_subclass=0, interface_protocol=0) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapi_switch.c:1112
#8  0x00005555562d0076 in HIDAPI_GetDeviceDriver (device=0x555556c69d60) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapijoystick.c:334
#9  0x00005555562d0445 in HIDAPI_SetupDeviceDriver (device=0x555556c69d60, removed=0x7fffffffdc60) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapijoystick.c:508
#10 0x00005555562d13d1 in HIDAPI_AddDevice (info=0x555556c39ce0, num_children=0, children=0x0) at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapijoystick.c:947
#11 0x00005555562d1971 in HIDAPI_UpdateDeviceList () at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapijoystick.c:1111
#12 0x00005555562d06e4 in HIDAPI_JoystickInit () at /tmp/Granite/third_party/sdl3/src/joystick/hidapi/SDL_hidapijoystick.c:600
#13 0x00005555561ee13c in SDL_InitJoysticks () at /tmp/Granite/third_party/sdl3/src/joystick/SDL_joystick.c:343
#14 0x00005555562da86a in SDL_InitSubSystem_REAL (flags=512) at /tmp/Granite/third_party/sdl3/src/SDL.c:305
#15 0x00005555562da6d4 in SDL_InitOrIncrementSubsystem (subsystem=512) at /tmp/Granite/third_party/sdl3/src/SDL.c:170
#16 0x00005555562da8b2 in SDL_InitSubSystem_REAL (flags=24608) at /tmp/Granite/third_party/sdl3/src/SDL.c:323
#17 0x00005555562da962 in SDL_Init_REAL (flags=24608) at /tmp/Granite/third_party/sdl3/src/SDL.c:391
#18 0x00005555561bb0de in SDL_Init_DEFAULT (a=24608) at /tmp/Granite/third_party/sdl3/src/dynapi/SDL_dynapi_procs.h:499
#19 0x00005555561c4bd0 in SDL_Init (a=24608) at /tmp/Granite/third_party/sdl3/src/dynapi/SDL_dynapi_procs.h:499
#20 0x0000555555617834 in Granite::WSIPlatformSDL::init (this=0x555556c0a5d0, name="granite", width_=1280, height_=720) at /tmp/Granite/application/platforms/application_sdl3.cpp:127
#21 0x0000555555614cd4 in Granite::application_main (query_application_interface=0x555556194d95 <Granite::query_application_interface(Granite::ApplicationQuery, void*, unsigned long)>, 
    create_application=0x555555607d69 <Granite::application_create(int, char**)>, argc=1, argv=0x7fffffffe198) at /tmp/Granite/application/platforms/application_sdl3.cpp:779
#22 0x000055555561460a in main (argc=1, argv=0x7fffffffe198) at /tmp/Granite/application/application_entry.cpp:93

Kernel:

Linux ryzen 6.6.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 05 Jan 2024 16:20:41 +0000 x86_64 GNU/Linux

Themaister commented 9 months ago

I tried updating the submodule to 649556befa156201116a4f25089597463d0efd44 and it still happens.

slouken commented 9 months ago

Does it happen with the SDL testcontroller test program?

slouken commented 9 months ago

I don't have Arch Linux, but it's working here on Ubuntu 22.04:

[INFO]: Targeting VK_KHR_present_wait latency to 1 frames.
INFO: SDL EVENT: SDL_EVENT_JOYSTICK_ADDED (timestamp=1000000000 which=2)
INFO: SDL EVENT: SDL_EVENT_GAMEPAD_ADDED (timestamp=1000000000 which=2)
[ERROR]: Failed to load Vulkan library.
INFO: SDL EVENT: SDL_EVENT_JOYSTICK_REMOVED (timestamp=1000000000 which=2)

Can you debug and find out why you're not getting any response from the controller?

Also, I'm assuming you're testing wired here, not Bluetooth?

Themaister commented 9 months ago

> Does it happen with the SDL testcontroller test program?

What is that and how do I build it?

> Also, I'm assuming you're testing wired here, not Bluetooth?

Wired, yes.

slouken commented 9 months ago

Configure SDL with -DSDL_TESTS=ON, and then run tests/testcontroller

Themaister commented 9 months ago

It seems to work in testcontroller, but scanning udev devices appears to take over 10 seconds. I wonder if that's what I was seeing. I can get past the hang in Granite now as well, and the controller worked, but it takes 20+ seconds. It also seems to happen without a controller plugged in, so it might be something pathological happening on my system in particular.

Themaister commented 9 months ago

I observed more hangs, but only in Granite so far ... I'll have to test this on more systems.

endrift commented 9 months ago

This could be tied to kernel version. I'll try seeing if I can reproduce on different kernels.

In the meantime, can you check which kernel driver is in use for the hid device that's opened?

Edit: Works normally on 6.1.69-1-lts.

endrift commented 9 months ago

I'm able to reproduce on 6.6.10-arch1-1

I'm not entirely sure what's going on here, but it looks like the reply ack from a command gets eaten by something (possibly the kernel module, but even rmmodding it doesn't fix it), and the SDL_GetTicks failsafe breaks: SDL_GetTicks() always returns 1000 no matter how long the program has been running. This might be a kernel bug compounded by an SDL bug.

endrift commented 9 months ago

The SDL_GetTicks issue seems to be caused by SDL_TIMERS_DISABLED being set by the Granite config. However, the SDL codepath should bail out if enough iterations are hit anyway.
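
To illustrate why a frozen tick counter defeats a time-based failsafe, and how an iteration cap still lets the loop bail out, here is a minimal sketch. It is not SDL's actual code: `frozen_ticks`, `advancing_ticks`, `try_read_reply`, and `wait_for_reply` are hypothetical stand-ins, and the clock is passed in as a function pointer only so both cases can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock type so the sketch can run with either clock. */
typedef uint64_t (*clock_fn)(void);

/* Models the reported symptom: the tick counter always returns 1000. */
static uint64_t frozen_ticks(void)
{
    return 1000;
}

/* A working clock for comparison: advances by 1 ms on every call. */
static uint64_t advancing_ticks(void)
{
    static uint64_t now = 1000;
    return now++;
}

/* Stand-in for a non-blocking read whose reply never arrives. */
static bool try_read_reply(void)
{
    return false;
}

/* Poll for a reply until it arrives, the timeout elapses, or max_polls
 * attempts have been made. The poll cap is the second line of defense:
 * it ends the loop even if the clock never moves.
 * Returns true if a reply was read; *polls_out receives the attempt count. */
static bool wait_for_reply(clock_fn ticks, uint64_t timeout_ms,
                           unsigned max_polls, unsigned *polls_out)
{
    uint64_t start = ticks();
    unsigned polls = 0;
    bool got_reply = false;

    assert(polls_out != NULL);

    while (polls < max_polls && ticks() - start < timeout_ms) {
        ++polls;
        if (try_read_reply()) {
            got_reply = true;
            break;
        }
    }

    *polls_out = polls;
    return got_reply;
}
```

With `advancing_ticks` the loop ends on the timeout well before the cap is reached. With `frozen_ticks` the elapsed time stays at zero, so only the `max_polls` check terminates the loop; without it, the wait would spin forever.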

Themaister commented 9 months ago

> In the meantime, can you check which kernel driver is in use for the hid device that's opened?

hid_nintendo

> The SDL_GetTicks issue seems to be caused by SDL_TIMERS_DISABLED being set by the Granite config.

I tried enabling that, and while I don't get a full hang anymore, the controller doesn't work either, probably because some timeout caused it not to be added.