libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
9.37k stars 1.74k forks source link

Bisected slowness in 2.30.0 when initializing gamepads #9092

Closed cspiegel closed 5 months ago

cspiegel commented 6 months ago

This is on Linux.

When calling SDL_Init(SDL_INIT_GAMECONTROLLER), 2.30 is noticeably slower for me than 2.28: 3 seconds vs 0.5 seconds.

I've bisected it to commit 3b1e0e16.

From what I can tell, the slowness is due to SDL_UDEV_GetProductInfo(). Prior to the bisected commit, this function was only called when a JSIOCGNAME ioctl was successful, i.e. if a Joystick name could be obtained; so at most it's called as many times as there are connected joysticks. Now it's called unconditionally, presumably at least as many times as there are input devices attached, and on my system, it takes around 100ms to run. I have 21 /dev/input/event* files, so that's 2100ms.

parkerlreed commented 5 months ago

I realized I was looking at the goto done immediately below the fd close not realizing it doesnt get called. Ill add some more more prints below this.

EDIT: So it doesnt hit the goto because doesnt print FALLBACK MaybeAddDevice fd lt 0 done but doesnt hit the rest? I added a bunch more prints but they never got hit

[2024-03-25 17:29:01.363295] INFO: FALLBACK MaybeAddDevice Begin
[2024-03-25 17:29:01.363305] INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
[2024-03-25 17:29:01.363316] INFO: FALLBACK MaybeAddDevice SDL_joylist check
[2024-03-25 17:29:01.363327] INFO: FALLBACK MaybeAddDevice Sensor List
[2024-03-25 17:29:01.363338] INFO: FALLBACK MaybeAddDevice fd open
[2024-03-25 17:29:01.363352] INFO: FALLBACK MaybeAddDevice fd close
[2024-03-25 17:29:01.432286] INFO: FALLBACK SDL_UnlockJoysticks
    SDL_Log("FALLBACK MaybeAddDevice fd open");
    fd = open(path, O_RDONLY | O_CLOEXEC, 0);
    SDL_Log("FALLBACK MaybeAddDevice fd close");
    if (fd < 0) {
        SDL_Log("FALLBACK MaybeAddDevice fd lt 0 done");
        goto done;
    }

#ifdef DEBUG_INPUT_EVENTS
    SDL_Log("Checking %s\n", path);
#endif

    if (IsJoystick(path, fd, &name, &vendor, &product, &guid)) {
#ifdef DEBUG_INPUT_EVENTS
        SDL_Log("found joystick: %s\n", path);
#endif
        SDL_Log("FALLBACK MaybeAddDevice item eq SDL_joylist_item");
        item = (SDL_joylist_item *)SDL_calloc(1, sizeof(SDL_joylist_item));
        SDL_Log("FALLBACK MaybeAddDevice item ne");
        if (!item) {
            SDL_Log("FALLBACK MaybeAddDevice item SDL_free");
            SDL_free(name);
            goto done;
        }

        item->devnum = sb.st_rdev;
        item->steam_virtual_gamepad_slot = -1;
        item->path = SDL_strdup(path);
        item->name = name;
        item->guid = guid;

        SDL_Log("FALLBACK MaybeAddDevice Valve Virtual Check");
        if (vendor == USB_VENDOR_VALVE &&
            product == USB_PRODUCT_STEAM_VIRTUAL_GAMEPAD) {
            GetSteamVirtualGamepadSlot(fd, &item->steam_virtual_gamepad_slot);
        }

        SDL_Log("FALLBACK MaybeAddDevice not item path or not item name");
        if ((!item->path) || (!item->name)) {
            SDL_Log("FALLBACK MaybeAddDevice FreeJoylistItem");
            FreeJoylistItem(item);
            goto done;
        }

        SDL_Log("FALLBACK MaybeAddDevice not SDL joylist tail");
        item->device_instance = SDL_GetNextJoystickInstanceID();
        if (!SDL_joylist_tail) {
            SDL_joylist = SDL_joylist_tail = item;
        } else {
            SDL_Log("FALLBACK MaybeAddDevice tail next");
            SDL_joylist_tail->next = item;
            SDL_joylist_tail = item;
        }

        /* Need to increment the joystick count before we post the event */
        SDL_Log("FALLBACK MaybeAddDevice Increment Joystick");
        ++numjoysticks;

        SDL_Log("FALLBACK MaybeAddDevice PrivateJoystickAdded");
        SDL_PrivateJoystickAdded(item->device_instance);
        goto done;
    }
parkerlreed commented 5 months ago

It's the second close???

    SDL_Log("FALLBACK MaybeAddDevice fd gt o eq 0 close");
    if (fd >= 0) {
        close(fd);
    }
    SDL_Log("FALLBACK MaybeAddDevice Close done");
    SDL_Log("FALLBACK SDL_UnlockJoysticks");
    SDL_UnlockJoysticks();
}
[2024-03-25 17:36:42.888811] INFO: FALLBACK MaybeAddDevice Begin
[2024-03-25 17:36:42.888826] INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
[2024-03-25 17:36:42.888839] INFO: FALLBACK MaybeAddDevice SDL_joylist check
[2024-03-25 17:36:42.888851] INFO: FALLBACK MaybeAddDevice Sensor List
[2024-03-25 17:36:42.888863] INFO: FALLBACK MaybeAddDevice fd open
[2024-03-25 17:36:42.888875] INFO: FALLBACK MaybeAddDevice fd close
[2024-03-25 17:36:42.889008] INFO: FALLBACK MaybeAddDevice fd gt o eq 0 close
[2024-03-25 17:36:42.928744] INFO: FALLBACK MaybeAddDevice Close done
[2024-03-25 17:36:42.928821] INFO: FALLBACK SDL_UnlockJoysticks
parkerlreed commented 5 months ago

For shits I removed that second close and it seems FINE

[2024-03-25 17:40:26.543610] INFO: FALLBACK MaybeAddDevice Begin
[2024-03-25 17:40:26.543617] INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
[2024-03-25 17:40:26.543625] INFO: FALLBACK MaybeAddDevice SDL_joylist check
[2024-03-25 17:40:26.543632] INFO: FALLBACK MaybeAddDevice Sensor List
[2024-03-25 17:40:26.543640] INFO: FALLBACK MaybeAddDevice fd open
[2024-03-25 17:40:26.543648] INFO: FALLBACK MaybeAddDevice fd close
[2024-03-25 17:40:26.543681] INFO: FALLBACK MaybeAddDevice fd gt o eq 0 close
[2024-03-25 17:40:26.543690] INFO: FALLBACK MaybeAddDevice Close done
[2024-03-25 17:40:26.543698] INFO: FALLBACK SDL_UnlockJoysticks
[2024-03-25 17:40:26.543705] INFO: FALLBACK ScanInputDevices free 1
[2024-03-25 17:40:26.543712] INFO: FALLBACK ScanInputDevices SDL_snprint
[2024-03-25 17:40:26.543720] INFO: FALLBACK ScanInputDevices MaybeAddDevice
[2024-03-25 17:40:26.543727] INFO: FALLBACK MaybeAddDevice Begin
[2024-03-25 17:40:26.543734] INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
[2024-03-25 17:40:26.543741] INFO: FALLBACK MaybeAddDevice SDL_joylist check
[2024-03-25 17:40:26.543749] INFO: FALLBACK MaybeAddDevice Sensor List
[2024-03-25 17:40:26.543757] INFO: FALLBACK MaybeAddDevice fd open
[2024-03-25 17:40:26.543764] INFO: FALLBACK MaybeAddDevice fd close
[2024-03-25 17:40:26.543797] INFO: FALLBACK MaybeAddDevice fd gt o eq 0 close
[2024-03-25 17:40:26.543806] INFO: FALLBACK MaybeAddDevice Close done
[2024-03-25 17:40:26.543813] INFO: FALLBACK SDL_UnlockJoysticks
parkerlreed commented 5 months ago

For reference this speedup was just removing this single close https://github.com/libsdl-org/SDL/blob/main/src/joystick/linux/SDL_sysjoystick.c#L511

parkerlreed commented 5 months ago

I told testgamecontroller to exit right before window creation.

Removing the close is not as big of a time sink as I thought.

INFO: FALLBACK MaybeAddDevice Begin
INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
INFO: FALLBACK MaybeAddDevice SDL_joylist check
INFO: FALLBACK MaybeAddDevice Already Have This One
INFO: FALLBACK MaybeAddDevice fd gt o eq 0 close
INFO: FALLBACK MaybeAddDevice Close done
INFO: FALLBACK SDL_UnlockJoysticks
INFO: Rumble supported
INFO: XBox 360 Controller 0: X360 Controller, /dev/input/event11 (guid 030081b85e0400008e02000014010000, VID 0x045e, PID 0x028e, player index = 0)
INFO: There are 1 game controller(s) attached (1 joystick(s))

real    0m2.534s
user    0m0.024s
sys     0m0.062s

Adding it back

INFO: FALLBACK MaybeAddDevice Begin
INFO: FALLBACK MaybeAddDevice SDL_LockJoysticks
INFO: FALLBACK MaybeAddDevice SDL_joylist check
INFO: FALLBACK MaybeAddDevice Already Have This One
INFO: FALLBACK MaybeAddDevice fd gt o eq 0 close
INFO: FALLBACK MaybeAddDevice Close done
INFO: FALLBACK SDL_UnlockJoysticks
INFO: Rumble supported
INFO: XBox 360 Controller 0: X360 Controller, /dev/input/event11 (guid 030081b85e0400008e02000014010000, VID 0x045e, PID 0x028e, player index = 0)
INFO: There are 1 game controller(s) attached (1 joystick(s))

real    0m2.639s
user    0m0.031s
sys     0m0.063s
parkerlreed commented 5 months ago

Sorry for the comment spam! Just an oddball of what could be causing the delay.

twhitehead commented 5 months ago

No problem from my end. I would like more to know what is going on.

With regard to the earlier questions at gprof and the -pg option. You need to provide it for compiling and linking.

I looked through the SDL2 autoconf build system, and my take was you need to set CFLAGS and LDFLAGS before running configure.

Running the program should produce a profile file you can then run gprof on. I'll mess around more with it once I get back.

parkerlreed commented 5 months ago

Today I learned about samply

Here's a profile link for running the testgamecontroller that I made exit before GUI

https://share.firefox.dev/496L8X3

parkerlreed commented 5 months ago

Each bump in the graph is a device being scanned... slowly hah.

Here's wine's profile. Same deal https://share.firefox.dev/3IRiM8v

Might be choking in systemd/udev itself

slouken commented 5 months ago

That is a neat tool.

It looks like you're not running the latest code with @twhitehead's changes. Inside SDL_UDEV_GetProductInfo() there is a call to udev_device_new_from_devnum() instead of udev_enumerate_scan_devices().

Can you try updating to the latest code from git and trying again?

parkerlreed commented 5 months ago

Oh woops yeah. So we're flat with latest SDL2 branch but still taking a second or two to load

https://share.firefox.dev/3Pyex5J

EDIT: This is commit 5fff9bc1e3089ab4b33faf8b734fab7e2689b536

parkerlreed commented 5 months ago

perf might be capturing the controller test better

image

Happy to delve into anything there

slouken commented 5 months ago

Oh woops yeah. So we're flat with latest SDL2 branch but still taking a second or two to load

So we're basically back where we started. Great!

It still seems unusual that it would take 2 seconds on your system. Note that the CPU sampling is basically blank for most of that time, which means it's stuck in some IO call. Do you have any old devices connected to your system that might be having USB timeouts or anything?

parkerlreed commented 5 months ago

Unplugging my mess of a USB hub setup, 1.6s

https://share.firefox.dev/49dsIEj

Unloading xpad (ROG Ally built in controller is Xbox), 1.4s

https://share.firefox.dev/3TQhb9j

All I really have left is the built in ASUS keyboard buttons and the non functional fingerprint reader and Bluetooth

[parker@rogally sdl2-git]$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 045e:028e Microsoft Corp. Xbox360 Controller
Bus 001 Device 003: ID 0b05:1abe ASUSTek Computer, Inc. N-KEY Device
Bus 001 Device 004: ID 0489:e0f5 Foxconn / Hon Hai Wireless_Device
Bus 001 Device 005: ID 1c7a:0588 LighTuning Technology Inc. ETU905A86-E
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 008 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
radgeRayden commented 5 months ago

Removing all my USB peripherals takes me to an initialization time of ~1.15s.

parkerlreed commented 5 months ago

Yeah I JUST saw the same. After running unbind on the ASUS, Xbox and fingerprint I'm just over a second

https://share.firefox.dev/4cyeoZE

radgeRayden commented 5 months ago

Tested adding the peripherals one by one, but only got a steady increase in loading time until I hit a result that varied between 1.85s-2s. Anecdotally I felt the steepest increases when adding the DS4 controller and my razer pro click mouse. Also adding my own results from samply: https://share.firefox.dev/3xcy859

twhitehead commented 5 months ago

Thanks for the nice profile info. From poking around on it (looking at the flame graphs, zooming in on the blank areas), I get the feeling this may be related to udev in general. For example, it looks to me like

Could there be some range of udev versions that suffer from very long IO waits? Maybe if we need to know what version of udev people that are experiencing the issue are running we would see a pattern. And maybe the distro too just in case there is something distro specific about it.

twhitehead commented 5 months ago

For reference, if I change (edit: this is backwards)

static SDL_bool done = SDL_TRUE;

to SDL_FALSE in test/testgamecontroller.c so it exits right away, my runitime on my laptop (a very old Intel Core m5) is

[tyson@manny:~/SDL/SDL2/test]$ time ../outputs/out/libexec/installed-tests/SDL2/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m0.068s
user    0m0.019s

which is so quick I don't get any useful profile information via grof as it is using a 0.01s sample interval.

I am running NixOS 23.11 with the systemd 254.6 udev.

parkerlreed commented 5 months ago

Arch Linux

systemd 255.4-2 (udev)

What testgamecontroller.c are you looking at? The file without any modifications as it stands has

static SDL_bool done = SDL_FALSE;

So that seems to have no affect on if it exits as that's the default value.

(Setting it to TRUE also doesn't exit...)

EDIT: Oh you dont have a gamepad connected. Let me test that.

Yeah the default is FALSE anyways so with no gamepad it exits and I see the delay

[parker@rogally test]$ time ./testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m2.134s
user    0m0.064s
sys     0m0.072s
parkerlreed commented 5 months ago

Attempted a downgrade to 254 and while I think the downgrade worked, still seeing the delay.

Kernel version for good measure

$ uname -a
Linux rogally 6.8.1-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 16 Mar 2024 17:15:35 +0000 x86_64 GNU/Linux
twhitehead commented 5 months ago

Right. I reversed those when I cut and pasted into the comment. I changed it to SDL_TRUE. There was a routine that conditionally ets it to SDL_FALSE, so that may be triggering for you and not me.

parkerlreed commented 5 months ago

Ahh thanks. Yeah no gamepads and true same deal

[parker@rogally test]$ time ./testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m2.330s
user    0m0.061s
sys     0m0.080s

Going to test some older kernel versions and see how it fares.

parkerlreed commented 5 months ago

Interesting. 6.4 has a lower average

[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.985s
user    0m0.045s
sys     0m0.064s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.730s
user    0m0.058s
sys     0m0.058s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.617s
user    0m0.060s
sys     0m0.048s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m2.003s
user    0m0.087s
sys     0m0.061s
parkerlreed commented 5 months ago

Ok following a random hunch, back on 6.8.1, I saw in perf a bit of time was spent in CPU idle states.

Added cpuidle.off=1 to the kernel and this actually dropped it quite a bit

[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.682s
user    0m0.045s
sys     0m0.081s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.724s
user    0m0.050s
sys     0m0.071s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.734s
user    0m0.046s
sys     0m0.079s
[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m1.682s
user    0m0.037s
sys     0m0.088s
parkerlreed commented 5 months ago

Without cpuidle.off=1 (and making sure I was on latest amd-ucode) its back up to 2.0-25 seconds.

Been messing with CPU idle settings with cpupower but haven't had any meaningful effect.

parkerlreed commented 5 months ago

If I stress all 16 threads powertop looks like

image

Do nothing? This thing is idling as hard as it can

image

Could this be some CPU bug where it's not waking very fast from idle?

Z1 Extreme with the amd-pstate-epp driver

processor       : 15
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 116
model name      : AMD Ryzen Z1 Extreme
stepping        : 1
microcode       : 0xa704104
cpu MHz         : 400.000
cache size      : 1024 KB
physical id     : 0
siblings        : 16
core id         : 7
cpu cores       : 8
apicid          : 15
initial apicid  : 15
fpu             : yes
fpu_exception   : yes
cpuid level     : 16
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca flush_l1d
bugs            : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips        : 6590.83
TLB size        : 3584 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14] [15]

Here's why I was looking at idle image

parkerlreed commented 5 months ago

Taking the kernel work out here's just testgamecontroller image

parkerlreed commented 5 months ago

Holy crap this is kernel. I had used this kernel before as it's a 6.1 with Valve's patches included. Just tried it and it completely solves the issue

https://github.com/rog-ally-gaming/linux-chimeraos/releases/tag/6.1.81.nrflx3-2

[parker@rogally sdl2-git]$ time ./src/SDL/test/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m0.703s
user    0m0.062s
sys     0m0.070s

Samply https://share.firefox.dev/49dPNqc

parkerlreed commented 5 months ago

For reference, if I change

static SDL_bool done = SDL_TRUE;

to SDL_FALSE in test/testgamecontroller.c so it exits right away, my runitime on my laptop (a very old Intel Core m5) is

[tyson@manny:~/SDL/SDL2/test]$ time ../outputs/out/libexec/installed-tests/SDL2/testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real  0m0.068s
user  0m0.019s

which is so quick I don't get any useful profile information via grof as it is using a 0.01s sample interval.

I am running NixOS 23.11 with the systemd 254.6 udev.

Still not quite as quick as this but damn close.

parkerlreed commented 5 months ago

And it's 0.7 regardless of how much USB I have hooked up so scanning seems completely solved.

Still curious as to why it's still 10x slower overall.

twhitehead commented 5 months ago

Just checked. I am running the stock NixOS Linux kernel 6.1.68. So that is in agreement with your observation of no slow down in 6.1.81. Wonder if I would see the issue if I switched to a 6.4 or 6.8 like you...

twhitehead commented 5 months ago

Seems NixOS won't let me use 6.4 as it has reached end of life. I did try with 6.5 though, and it seems to be fine

[tyson@manny:~/SDL/SDL2]$ uname -ar
Linux manny 6.5.13 #1-NixOS SMP PREEMPT_DYNAMIC Tue Nov 28 17:15:19 UTC 2023 x86_64 GNU/Linux

[tyson@manny:~/SDL/SDL2]$ time ./outputs/out/libexec/installed-tests/SDL2/testgamecontroller
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m0.086s
user    0m0.030s
sys 0m0.033s

What if you use ebf-trace to do an off-cpu trace and flame graph to get an idea of what is happening at the IO level during the hangup?

twhitehead commented 5 months ago

I believe this works pretty good (using my distros bcc and flamegraph packages and running as root)

( sleep 10; exec ./outputs/out/libexec/installed-tests/SDL2/testgamecontroller; ) & offcputime -f 20 -p $! > testgamecontroller.log; flamegraph.pl testgamecontroller.log > testgamecontroller.svg

and then just view the svg file using your browser. Here is the one from my laptop running the 6.5.13 kernel.

parkerlreed commented 5 months ago

So even on the good kernel running with root is as slow as newer kernel?? (Getting the bpf stuff setup just noticed this in preparation)

[parker@rogally test]$ time ./testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m0.699s
user    0m0.061s
sys     0m0.076s
[parker@rogally test]$ time sudo ./testgamecontroller 
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m2.362s
user    0m0.006s
sys     0m0.004s
parkerlreed commented 5 months ago

Running under root (sudo su)

image

parkerlreed commented 5 months ago

Oh wow I didnt realize what you meant at first.... I've never seen an interactive SVG O.o

And Github just converts it... How did you upload yours?

I try attaching it a as file and it seems to do the same as yours but it wont let me interact with the result

twhitehead commented 5 months ago

I believe I may have uploaded it as an image, realized it was breaking the interaction, and cut and pasted the url into a markdown link instead of uploading it again. Or I may have just uploaded it again as a file and then cut and pasted that link into my markdown link. Can't recall.

parkerlreed commented 5 months ago

Here

Yep reformat into regular link works

So that's WITH sudo slowing it down. I tried unpriviledged bpf but can't get it working (to see the 0.7 second result)

parkerlreed commented 5 months ago

Running as user but offcputime with sudo worked

testgamecontroller

libusb maybe suspect?

parkerlreed commented 5 months ago

My best uneducated guess from looking at my svg is:

Most of my time is spent pre testgamecontorller init in early USB. If you just look at testgamecontroller, the execution time there is what is expected. So what could be causing a libsub hangup so early?

twhitehead commented 5 months ago

Yeah. It does seem like there is something USB related going on. Not sure if it is early or not though as the order is by function name and not time. The [unknown] entries lump everything together so you can't tell what is where.

With some googling and experimenting, it seems there are two items you need to fully resolve the stack traces

The later isn't too bad as you can just add a sleep before main returns. The former is more tricky as you have to do all the dependencies too (a missing frame pointer higher up in the stack entirely obscures everything below it regardless of whether they have frame pointers or not).

I get the feeling this may be the default it Fedora and Ubuntu now though. It currently isn't the default in NixOS, but NixOS does give the ability to override everything, so I have done that and am rebuilding. It is a lot of packages though, so it will take a long time.

I think I can then package that as an RPM or DEB that you can install to test it and them remove to get rid of it. Assuming your system supports one of these formats?

parkerlreed commented 5 months ago

Arch I THINK defaults to omit (at least the local makepkg does I dont know how to confirm for official packages)

But I can easily compile official packages with the local makepkg so this should be doable.

CFLAGS="-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions \
        -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security \
        -fstack-clash-protection -fcf-protection \
        -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"
parkerlreed commented 5 months ago

libsub recompiled (sdl2-git was already using the omit) still seeing a bunch of unknown. I'll see if I can figure out what those are supposed to be.

twhitehead commented 5 months ago

Good to hear that you can rebuild with -fno-omit-frame-pointer and -mno-omit-leaf-pointer.

With regard to running as root being slower on 6.1 as well, I wonder if you have a /dev or /sys file that is slow when probed and switched form being root accessible only under 6.1 to user accessible under 6.4?

Maybe doing

find /dev /sys -printf '%p %m %u %g\n' | sort

under both and diffing them might reveal something?

parkerlreed commented 5 months ago

So I just did that find for 6.1 on both user and root and I can visibly notice it taking longer to run for root... Going to add the timestamps to execution and see if anything pops out.

twhitehead commented 5 months ago

Look at /proc//maps to see what so files it sucks in while it is sleeping before exiting (some are dynamically loaded, so they won't show up with ldd).

parkerlreed commented 5 months ago

6.1 root taking longer may be due to the fact it exposes /sys/kernel/tracing/ which is a doozy

6.1 to 6.8 is oddly 2.2 MB smaller

6.1-user.txt 6.8-user.txt

twhitehead commented 5 months ago

In @radgeRayden's earlier strace -r log, I noticed some of the /dev/input/eventXX stats were taking between 1/10th and 1/100th of a second compared to the others which where 2/10,000th to 4/10,000th of a second.

To quote the output, which is sorted by time, we have a bunch in 0.000028 to 0.000061

     0.000028 newfstatat(AT_FDCWD, "/dev/input/event21", {st_mode=S_IFCHR|0660, st_rdev=makedev(0xd, 0x55), ...}, 0) = 0
...
     0.000061 newfstatat(AT_FDCWD, "/dev/input/event19", {st_mode=S_IFCHR|0660, st_rdev=makedev(0xd, 0x53), ...}, 0) = 0

and then, with nothing in between, a bunch more between 0.010 and 0.111

     0.010093 newfstatat(AT_FDCWD, "/dev/input/event21", {st_mode=S_IFCHR|0660, st_rdev=makedev(0xd, 0x55), ...}, 0) = 0
...
     0.105205 newfstatat(AT_FDCWD, "/dev/input/event14", {st_mode=S_IFCHR|0660, st_rdev=makedev(0xd, 0x4e), ...}, 0) = 0
parkerlreed commented 5 months ago

Ok we can safely narrow it down to /dev most likely. I mount overbinded a blank folder to /dev and 6.8 execution is quick

Only /sys being blank had no effect

[parker@rogally test]$ time ./testgamecontroller
INFO: There are 0 game controller(s) attached (0 joystick(s))

real    0m0.226s
user    0m0.057s
sys     0m0.071s