libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License

Binary size reduction #9206

Open icculus opened 4 months ago

icculus commented 4 months ago

But yes, opening a new issue to discuss size reduction is a good idea.

Originally posted by @slouken in https://github.com/libsdl-org/SDL/issues/9054#issuecomment-1981312050

icculus commented 4 months ago

Okay, here we are.

Some notes:

Here's what bloaty says about SDL3 with MinSizeRel plus debug information...

icculus@lucha:~/Desktop/bloaty/buildbot$ ./bloaty ~/projects/SDL-icculus/buildbot/libSDL3.so -d compileunits -n 0 -s file
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  11.7%  1.51Mi   0.0%       0    [section .debug_str]
  11.5%  1.49Mi   0.0%       0    [section .debug_macro]
  10.0%  1.29Mi   0.0%       0    [section .debug_loclists]
   5.0%   667Ki  10.8%   195Ki    /home/icculus/projects/SDL-icculus/src/dynapi/SDL_dynapi.c
   2.6%   338Ki   1.9%  34.3Ki    /home/icculus/projects/SDL-icculus/src/video/yuv2rgb/yuv_rgb_sse.c
   2.2%   292Ki   2.5%  45.1Ki    /home/icculus/projects/SDL-icculus/src/render/vulkan/SDL_render_vulkan.c
   1.4%   183Ki   3.6%  65.2Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_auto.c
   1.3%   171Ki   5.2%  94.4Ki    /home/icculus/projects/SDL-icculus/src/joystick/SDL_gamepad.c
   1.1%   145Ki   2.2%  39.7Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_video.c
   1.0%   137Ki   1.4%  26.2Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandevents.c
   1.0%   135Ki   2.1%  38.2Ki    /home/icculus/projects/SDL-icculus/src/render/SDL_render.c
   1.0%   126Ki   3.1%  56.9Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_N.c
   0.9%   116Ki   2.4%  43.9Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_blendline.c
   0.8%   105Ki   0.0%       0    [section .debug_rnglists]
   0.8%   104Ki   1.3%  24.0Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandwindow.c
   0.8%   102Ki   0.9%  16.7Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11window.c
   0.7%  98.7Ki   1.4%  25.6Ki    /home/icculus/projects/SDL-icculus/src/joystick/SDL_joystick.c
   0.7%  97.5Ki   0.7%  12.7Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11dyn.c
   0.7%  91.7Ki   0.7%  12.2Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11events.c
   0.7%  89.8Ki   0.9%  16.3Ki    /home/icculus/projects/SDL-icculus/src/audio/pipewire/SDL_pipewire.c
   0.7%  87.7Ki   1.4%  25.8Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_A.c
   0.6%  84.8Ki   1.3%  24.0Ki    /home/icculus/projects/SDL-icculus/src/video/yuv2rgb/yuv_rgb_std.c
   0.6%  83.2Ki   1.2%  22.4Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_yuv.c
   0.6%  81.1Ki   0.5%  8.32Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11modes.c
   0.6%  77.5Ki   1.0%  17.2Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_audio.c
   0.6%  77.4Ki   1.0%  18.5Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_audiocvt.c
   0.6%  76.3Ki   1.0%  17.5Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_switch.c
   0.6%  75.4Ki   1.0%  19.0Ki    /home/icculus/projects/SDL-icculus/src/render/opengl/SDL_render_gl.c
   0.5%  72.2Ki   1.0%  18.3Ki    /home/icculus/projects/SDL-icculus/src/render/opengles2/SDL_render_gles2.c
   0.5%  72.2Ki   1.0%  18.7Ki    /home/icculus/projects/SDL-icculus/src/joystick/linux/SDL_sysjoystick.c
   0.5%  71.9Ki   0.8%  15.1Ki    /home/icculus/projects/SDL-icculus/src/hidapi/SDL_hidapi.c
   0.5%  70.2Ki   0.6%  11.1Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandvideo.c
   0.5%  69.8Ki   0.4%  7.92Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11opengl.c
   0.5%  69.3Ki   1.5%  27.8Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_blendfillrect.c
   0.5%  68.2Ki   0.3%  6.00Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11messagebox.c
   0.5%  64.1Ki   0.1%  2.43Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11clipboard.c
   0.5%  63.9Ki   0.2%  3.57Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11xinput2.c
   0.5%  62.3Ki   0.2%  3.82Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11mouse.c
   0.5%  61.6Ki   0.2%  4.48Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11video.c
   0.5%  61.5Ki   1.1%  19.6Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_0.c
   0.5%  60.7Ki   0.3%  5.98Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_ibus.c
   0.5%  60.6Ki   0.9%  15.9Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_RLEaccel.c
   0.4%  59.4Ki   0.1%  2.07Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11vulkan.c
   0.4%  58.9Ki   1.0%  17.5Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapijoystick.c
   0.4%  58.4Ki   0.2%  2.97Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11pen.c
   0.4%  57.5Ki   0.2%  2.82Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11keyboard.c
   0.4%  55.6Ki   0.5%  9.23Ki    /home/icculus/projects/SDL-icculus/src/audio/pulseaudio/SDL_pulseaudio.c
   0.4%  54.7Ki   0.2%  3.65Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_fcitx.c
   0.4%  54.1Ki   0.5%  9.70Ki    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmvideo.c
   0.4%  53.3Ki   1.6%  29.1Ki    /home/icculus/projects/SDL-icculus/src/video/x11/edid-parse.c
   0.4%  52.9Ki   0.5%  9.92Ki    /home/icculus/projects/SDL-icculus/src/camera/SDL_camera.c
   0.4%  51.8Ki   0.3%  5.80Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandmouse.c
   0.4%  51.8Ki   0.9%  16.7Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_events.c
   0.4%  51.5Ki   0.1%  1.80Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11framebuffer.c
   0.4%  50.6Ki   1.1%  19.8Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_keyboard.c
   0.4%  50.4Ki   0.4%  6.56Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_steam.c
   0.4%  50.2Ki   0.1%  1.50Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11xfixes.c
   0.4%  49.9Ki   0.5%  9.05Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_ps5.c
   0.4%  49.4Ki   0.5%  8.81Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylanddyn.c
   0.4%  49.3Ki   0.6%  10.3Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_render_sw.c
   0.4%  48.8Ki   0.5%  9.78Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_mouse.c
   0.4%  48.7Ki   0.5%  8.99Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_wii.c
   0.4%  48.2Ki   0.8%  13.7Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_surface.c
   0.4%  47.2Ki   0.4%  8.15Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_xboxone.c
   0.3%  45.1Ki   0.4%  7.73Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_ps4.c
   0.3%  44.5Ki   0.5%  9.25Ki    /home/icculus/projects/SDL-icculus/src/audio/alsa/SDL_alsa_audio.c
   0.3%  44.4Ki   0.1%  1.00Ki    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11opengles.c
   0.3%  44.3Ki   0.7%  12.3Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_wave.c
   0.3%  42.1Ki   0.0%     602    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11shape.c
   0.3%  41.1Ki   0.7%  12.2Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_pixels.c
   0.3%  40.7Ki   0.3%  6.12Ki    /home/icculus/projects/SDL-icculus/src/camera/v4l2/SDL_camera_v4l2.c
   0.3%  40.3Ki   0.4%  7.43Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_ps3.c
   0.3%  39.5Ki   0.0%     138    /home/icculus/projects/SDL-icculus/src/video/x11/SDL_x11touch.c
   0.3%  39.1Ki   0.7%  13.2Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_1.c
   0.3%  38.2Ki   0.4%  8.00Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_pen.c
   0.3%  38.2Ki   1.1%  20.3Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_evdev_kbd.c
   0.3%  37.8Ki   0.4%  6.55Ki    /home/icculus/projects/SDL-icculus/src/haptic/linux/SDL_syshaptic.c
   0.3%  37.7Ki   0.6%  10.8Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_egl.c
   0.3%  37.4Ki   0.2%  3.03Ki    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmvulkan.c
   0.3%  37.2Ki   0.8%  14.1Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_triangle.c
   0.2%  32.8Ki   0.3%  4.97Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylanddatamanager.c
   0.2%  32.4Ki   0.4%  8.10Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_dbus.c
   0.2%  31.4Ki   1.8%  31.7Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_audioresample.c
   0.2%  31.4Ki   0.1%  1.86Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_audiotypecvt.c
   0.2%  31.0Ki   0.3%  5.98Ki    /home/icculus/projects/SDL-icculus/src/joystick/virtual/SDL_virtualjoystick.c
   0.2%  30.8Ki   0.3%  5.78Ki    /home/icculus/projects/SDL-icculus/src/SDL_properties.c
   0.2%  30.7Ki   0.1%  2.29Ki    /home/icculus/projects/SDL-icculus/src/SDL.c
   0.2%  30.1Ki   0.1%  1.26Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandclipboard.c
   0.2%  30.1Ki   0.1%  1.26Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandvulkan.c
   0.2%  30.0Ki   0.3%  5.58Ki    /home/icculus/projects/SDL-icculus/src/haptic/SDL_haptic.c
   0.2%  30.0Ki   0.3%  5.00Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_evdev.c
   0.2%  30.0Ki   0.3%  5.01Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_vulkan_utils.c
   0.2%  29.9Ki   0.2%  3.18Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_steamdeck.c
   0.2%  29.9Ki   0.3%  4.61Ki    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_string.c
   0.2%  29.8Ki   0.2%  4.34Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_shield.c
   0.2%  29.7Ki   0.0%       0    [section .debug_line_str]
   0.2%  29.7Ki   0.9%  16.1Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/wayland-protocol.c
   0.2%  29.7Ki   0.3%  5.04Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_gamecube.c
   0.2%  29.5Ki   0.2%  3.35Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_touch.c
   0.2%  29.1Ki   0.1%  2.54Ki    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmmouse.c
   0.2%  29.0Ki   0.3%  5.41Ki    /home/icculus/projects/SDL-icculus/src/power/linux/SDL_syspower.c
   0.2%  27.9Ki   0.2%  3.39Ki    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmdyn.c
   0.2%  27.6Ki   0.2%  3.92Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_stretch.c
   0.2%  27.5Ki   0.5%  9.72Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_slow.c
   0.2%  27.4Ki   0.2%  4.30Ki    /home/icculus/projects/SDL-icculus/src/audio/jack/SDL_jackaudio.c
   0.2%  26.8Ki   0.2%  3.64Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_luna.c
   0.2%  26.7Ki   0.3%  5.81Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_udev.c
   0.2%  26.5Ki   0.2%  3.52Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_xbox360w.c
   0.2%  25.2Ki   0.3%  5.47Ki    /home/icculus/projects/SDL-icculus/src/file/SDL_rwops.c
   0.2%  25.1Ki   0.2%  2.83Ki    /home/icculus/projects/SDL-icculus/src/SDL_assert.c
   0.2%  25.0Ki   0.2%  2.90Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_xbox360.c
   0.2%  24.1Ki   0.2%  2.79Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_stadia.c
   0.2%  24.0Ki   0.2%  3.41Ki    /home/icculus/projects/SDL-icculus/src/sensor/SDL_sensor.c
   0.2%  23.2Ki   0.7%  12.2Ki    /home/icculus/projects/SDL-icculus/src/joystick/controller_type.c
   0.2%  22.6Ki   0.7%  12.8Ki    /home/icculus/projects/SDL-icculus/src/render/vulkan/SDL_shaders_vulkan.c
   0.2%  22.3Ki   0.2%  4.28Ki    /home/icculus/projects/SDL-icculus/src/SDL_log.c
   0.2%  22.0Ki   0.1%  1.24Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandkeyboard.c
   0.2%  21.8Ki   0.1%  2.14Ki    /home/icculus/projects/SDL-icculus/src/cpuinfo/SDL_cpuinfo.c
   0.2%  21.8Ki   0.2%  2.73Ki    /home/icculus/projects/SDL-icculus/src/thread/SDL_thread.c
   0.2%  21.7Ki   0.4%  6.75Ki    /home/icculus/projects/SDL-icculus/src/render/opengl/SDL_shaders_gl.c
   0.2%  21.4Ki   0.1%     970    /home/icculus/projects/SDL-icculus/src/video/dummy/SDL_nullvideo.c
   0.2%  21.3Ki   0.1%  2.35Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_combined.c
   0.2%  21.1Ki   0.2%  2.84Ki    /home/icculus/projects/SDL-icculus/src/audio/sndio/SDL_sndioaudio.c
   0.2%  20.9Ki   0.0%     673    /home/icculus/projects/SDL-icculus/src/video/offscreen/SDL_offscreenvideo.c
   0.2%  20.7Ki   0.3%  5.12Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_bmp.c
   0.2%  20.5Ki   0.1%  1.63Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_windowevents.c
   0.2%  20.2Ki   0.1%  1.08Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandopengles.c
   0.2%  20.0Ki   0.4%  7.34Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_blendpoint.c
   0.1%  19.3Ki   0.1%  1.84Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_threadprio.c
   0.1%  19.2Ki   0.2%  3.52Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_clipboard.c
   0.1%  19.0Ki   0.1%  2.33Ki    /home/icculus/projects/SDL-icculus/src/timer/SDL_timer.c
   0.1%  19.0Ki   0.0%     278    /home/icculus/projects/SDL-icculus/src/events/SDL_displayevents.c
   0.1%  18.7Ki   0.4%  7.28Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/tablet-unstable-v2-protocol.c
   0.1%  18.6Ki   0.1%  1.82Ki    /home/icculus/projects/SDL-icculus/src/joystick/hidapi/SDL_hidapi_rumble.c
   0.1%  17.7Ki   0.2%  4.00Ki    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_qsort.c
   0.1%  17.4Ki   0.3%  5.49Ki    /home/icculus/projects/SDL-icculus/src/render/opengles2/SDL_shaders_gles2.c
   0.1%  17.4Ki   0.3%  4.77Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_rect.c
   0.1%  17.1Ki   0.0%       0    
   0.1%  16.5Ki   0.1%  1.51Ki    /home/icculus/projects/SDL-icculus/src/audio/disk/SDL_diskaudio.c
   0.1%  16.3Ki   0.1%  1.98Ki    /home/icculus/projects/SDL-icculus/src/SDL_hints.c
   0.1%  16.2Ki   0.2%  4.01Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_rotate.c
   0.1%  15.7Ki   0.1%  1.06Ki    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmopengles.c
   0.1%  15.6Ki   0.1%  1.49Ki    /home/icculus/projects/SDL-icculus/src/joystick/SDL_steam_virtual_gamepad.c
   0.1%  15.6Ki   0.1%  2.68Ki    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_stdlib.c
   0.1%  15.5Ki   0.1%  1.26Ki    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_system_theme.c
   0.1%  15.5Ki   0.3%  5.34Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/xdg-shell-protocol.c
   0.1%  15.4Ki   0.1%  1.31Ki    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_systhread.c
   0.1%  15.3Ki   0.1%  2.39Ki    /home/icculus/projects/SDL-icculus/src/libm/e_pow.c
   0.1%  15.3Ki   0.2%  3.81Ki    /home/icculus/projects/SDL-icculus/src/render/software/SDL_drawline.c
   0.1%  15.1Ki   0.2%  3.58Ki    /home/icculus/projects/SDL-icculus/src/events/imKStoUCS.c
   0.1%  14.9Ki   0.2%  3.03Ki    /home/icculus/projects/SDL-icculus/src/filesystem/SDL_filesystem.c
   0.1%  14.3Ki   0.0%     788    /home/icculus/projects/SDL-icculus/src/events/SDL_dropevents.c
   0.1%  14.3Ki   0.0%     927    /home/icculus/projects/SDL-icculus/src/main/SDL_main_callbacks.c
   0.1%  14.2Ki   0.1%  1.84Ki    /home/icculus/projects/SDL-icculus/src/libm/k_rem_pio2.c
   0.1%  14.0Ki   0.2%  3.83Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_scancode_tables.c
   0.1%  13.9Ki   0.2%  3.14Ki    /home/icculus/projects/SDL-icculus/src/render/SDL_yuv_sw.c
   0.1%  13.6Ki   0.0%     663    /home/icculus/projects/SDL-icculus/src/events/SDL_quit.c
   0.1%  13.4Ki   0.1%  1.08Ki    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_malloc.c
   0.1%  13.3Ki   0.0%     675    /home/icculus/projects/SDL-icculus/src/audio/dummy/SDL_dummyaudio.c
   0.1%  13.2Ki   0.1%  2.60Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_audioqueue.c
   0.1%  13.1Ki   0.0%     524    /home/icculus/projects/SDL-icculus/src/camera/dummy/SDL_camera_dummy.c
   0.1%  13.0Ki   0.1%  1.39Ki    /home/icculus/projects/SDL-icculus/src/audio/SDL_mixer.c
   0.1%  12.8Ki   0.0%     523    /home/icculus/projects/SDL-icculus/src/video/offscreen/SDL_offscreenwindow.c
   0.1%  12.5Ki   0.2%  2.74Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_fillrect.c
   0.1%  12.3Ki   0.1%  2.71Ki    /home/icculus/projects/SDL-icculus/src/filesystem/unix/SDL_sysfilesystem.c
   0.1%  12.3Ki   0.1%  2.02Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/primary-selection-unstable-v1-protocol.c
   0.1%  12.2Ki   0.1%  1.29Ki    /home/icculus/projects/SDL-icculus/src/libm/e_rem_pio2.c
   0.1%  11.9Ki   0.0%     253    /home/icculus/projects/SDL-icculus/src/video/offscreen/SDL_offscreenopengles.c
   0.1%  11.9Ki   0.0%       0    [section .symtab]
   0.1%  11.9Ki   0.0%     803    /home/icculus/projects/SDL-icculus/src/video/offscreen/SDL_offscreenframebuffer.c
   0.1%  11.9Ki   0.1%  1.79Ki    /home/icculus/projects/SDL-icculus/src/events/SDL_keysym_to_scancode.c
   0.1%  11.8Ki   0.0%     683    /home/icculus/projects/SDL-icculus/src/video/dummy/SDL_nullframebuffer.c
   0.1%  11.8Ki   0.0%     751    /home/icculus/projects/SDL-icculus/src/sensor/dummy/SDL_dummysensor.c
   0.1%  11.6Ki   0.1%  1.84Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/pointer-constraints-unstable-v1-protocol.c
   0.1%  11.3Ki   0.1%  1.88Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/text-input-unstable-v3-protocol.c
   0.1%  11.0Ki   0.0%     541    /home/icculus/projects/SDL-icculus/src/timer/unix/SDL_systimer.c
   0.1%  10.9Ki   0.0%     441    /home/icculus/projects/SDL-icculus/src/main/generic/SDL_sysmain_callbacks.c
   0.1%  10.7Ki   0.0%     771    /home/icculus/projects/SDL-icculus/src/libm/s_atan.c
   0.1%  10.6Ki   0.1%  1.21Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/xdg-activation-v1-protocol.c
   0.1%  10.6Ki   0.0%     386    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_systls.c
   0.1%  10.5Ki   0.1%    1016    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/input-timestamps-unstable-v1-protocol.c
   0.1%  10.5Ki   0.1%  1.05Ki    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/xdg-output-unstable-v1-protocol.c
   0.1%  10.4Ki   0.0%      41    /home/icculus/projects/SDL-icculus/src/video/kmsdrm/SDL_kmsdrmevents.c
   0.1%  10.4Ki   0.0%     792    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/keyboard-shortcuts-inhibit-unstable-v1-protocol.c
   0.1%  10.4Ki   0.0%     880    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/xdg-decoration-unstable-v1-protocol.c
   0.1%  10.2Ki   0.0%     728    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/relative-pointer-unstable-v1-protocol.c
   0.1%  10.1Ki   0.0%     696    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/fractional-scale-v1-protocol.c
   0.1%  10.1Ki   0.1%  2.29Ki    /home/icculus/projects/SDL-icculus/src/video/wayland/SDL_waylandmessagebox.c
   0.1%  10.0Ki   0.0%     362    /home/icculus/projects/SDL-icculus/src/power/SDL_power.c
   0.1%  10.0Ki   0.0%     784    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/viewporter-protocol.c
   0.1%  9.90Ki   0.0%     568    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/idle-inhibit-unstable-v1-protocol.c
   0.1%  9.57Ki   0.0%     424    /home/icculus/projects/SDL-icculus/buildbot/wayland-generated-protocols/kde-output-order-v1-protocol.c
   0.1%  9.14Ki   0.0%     149    /home/icculus/projects/SDL-icculus/src/events/SDL_clipboardevents.c
   0.1%  8.97Ki   0.1%  1.48Ki    /home/icculus/projects/SDL-icculus/src/video/SDL_blit.c
   0.1%  8.78Ki   0.1%  1.72Ki    /home/icculus/projects/SDL-icculus/src/SDL_hashtable.c
   0.1%  8.67Ki   0.0%      37    /home/icculus/projects/SDL-icculus/src/video/offscreen/SDL_offscreenevents.c
   0.1%  8.67Ki   0.0%      37    /home/icculus/projects/SDL-icculus/src/video/dummy/SDL_nullevents.c
   0.1%  8.12Ki   0.0%       0    [section .strtab]
   0.1%  7.99Ki   0.1%  1.33Ki    /home/icculus/projects/SDL-icculus/src/filesystem/posix/SDL_sysfsops.c
   0.1%  7.65Ki   0.0%     843    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_ime.c
   0.1%  7.21Ki   0.1%  2.17Ki    /home/icculus/projects/SDL-icculus/src/render/SDL_d3dmath.c
   0.1%  7.03Ki   0.0%     680    /home/icculus/projects/SDL-icculus/src/video/SDL_blit_copy.c
   0.1%  6.76Ki   0.4%  6.76Ki    [section .gnu.hash]
   0.0%  6.39Ki   0.1%    1012    /home/icculus/projects/SDL-icculus/src/core/SDL_core_unsupported.c
   0.0%  6.06Ki   0.1%     978    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_syscond.c
   0.0%  5.52Ki   0.1%     991    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_syssem.c
   0.0%  5.46Ki   0.3%  5.46Ki    [section .dynsym]
   0.0%  5.27Ki   0.0%     177    /home/icculus/projects/SDL-icculus/src/joystick/steam/SDL_steamcontroller.c
   0.0%  5.07Ki   0.0%     606    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_iconv.c
   0.0%  5.01Ki   0.0%     907    /home/icculus/projects/SDL-icculus/src/SDL_error.c
   0.0%  5.00Ki   0.0%     660    /home/icculus/projects/SDL-icculus/src/render/software/SDL_drawpoint.c
   0.0%  4.80Ki   0.0%     478    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_sysrwlock.c
   0.0%  4.78Ki   0.0%     453    /home/icculus/projects/SDL-icculus/src/thread/pthread/SDL_sysmutex.c
   0.0%  4.45Ki   0.0%       0    [Unmapped]
   0.0%  4.09Ki   0.0%     661    /home/icculus/projects/SDL-icculus/src/locale/unix/SDL_syslocale.c
   0.0%  4.07Ki   0.0%     627    /home/icculus/projects/SDL-icculus/src/libm/e_exp.c
   0.0%  4.02Ki   0.0%     763    /home/icculus/projects/SDL-icculus/src/libm/e_fmod.c
   0.0%  3.97Ki   0.0%     432    /home/icculus/projects/SDL-icculus/src/SDL_guid.c
   0.0%  3.87Ki   0.0%     504    /home/icculus/projects/SDL-icculus/src/loadso/dlopen/SDL_sysloadso.c
   0.0%  3.84Ki   0.0%     691    /home/icculus/projects/SDL-icculus/src/libm/k_tan.c
   0.0%  3.83Ki   0.0%     700    /home/icculus/projects/SDL-icculus/src/libm/e_log.c
   0.0%  3.70Ki   0.0%     551    /home/icculus/projects/SDL-icculus/src/libm/e_atan2.c
   0.0%  3.67Ki   0.0%     423    /home/icculus/projects/SDL-icculus/src/locale/SDL_locale.c
   0.0%  3.52Ki   0.2%  3.52Ki    [section .plt]
   0.0%  3.50Ki   0.2%  3.50Ki    [section .plt.sec]
   0.0%  3.41Ki   0.0%     348    /home/icculus/projects/SDL-icculus/src/atomic/SDL_atomic.c
   0.0%  3.38Ki   0.0%     471    /home/icculus/projects/SDL-icculus/src/libm/e_sqrt.c
   0.0%  3.31Ki   0.0%     421    /home/icculus/projects/SDL-icculus/src/misc/unix/SDL_sysurl.c
   0.0%  3.25Ki   0.0%     378    /home/icculus/projects/SDL-icculus/src/libm/s_scalbn.c
   0.0%  3.20Ki   0.0%     346    /home/icculus/projects/SDL-icculus/src/SDL_list.c
   0.0%  3.17Ki   0.0%     419    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_evdev_capabilities.c
   0.0%  3.15Ki   0.0%     265    /home/icculus/projects/SDL-icculus/src/video/SDL_video_unsupported.c
   0.0%  3.11Ki   0.1%  1.23Ki    /home/icculus/projects/SDL-icculus/src/core/unix/SDL_appid.c
   0.0%  2.97Ki   0.0%     281    /home/icculus/projects/SDL-icculus/src/core/unix/SDL_poll.c
   0.0%  2.90Ki   0.0%     297    /home/icculus/projects/SDL-icculus/src/libm/e_log10.c
   0.0%  2.79Ki   0.0%     236    /home/icculus/projects/SDL-icculus/src/libm/s_sin.c
   0.0%  2.77Ki   0.0%     234    /home/icculus/projects/SDL-icculus/src/libm/s_cos.c
   0.0%  2.74Ki   0.0%     322    /home/icculus/projects/SDL-icculus/src/libm/s_floor.c
   0.0%  2.70Ki   0.0%     215    /home/icculus/projects/SDL-icculus/src/libm/s_modf.c
   0.0%  2.67Ki   0.0%     229    /home/icculus/projects/SDL-icculus/src/SDL_utils.c
   0.0%  2.67Ki   0.0%     298    /home/icculus/projects/SDL-icculus/src/libm/k_cos.c
   0.0%  2.63Ki   0.0%     267    /home/icculus/projects/SDL-icculus/src/libm/k_sin.c
   0.0%  2.63Ki   0.0%     196    /home/icculus/projects/SDL-icculus/src/libm/s_tan.c
   0.0%  2.56Ki   0.0%       0    [ELF Section Headers]
   0.0%  2.46Ki   0.1%  2.46Ki    [section .data.rel.ro]
   0.0%  2.43Ki   0.0%     183    /home/icculus/projects/SDL-icculus/src/atomic/SDL_spinlock.c
   0.0%  2.36Ki   0.0%     245    /home/icculus/projects/SDL-icculus/src/core/linux/SDL_sandbox.c
   0.0%  2.34Ki   0.0%     182    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_getenv.c
   0.0%  2.34Ki   0.0%     155    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_memset.c
   0.0%  2.33Ki   0.0%     101    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_crc16.c
   0.0%  2.31Ki   0.1%  2.31Ki    [section .gnu.version]
   0.0%  2.29Ki   0.0%     106    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_crc32.c
   0.0%  2.16Ki   0.0%      95    /home/icculus/projects/SDL-icculus/src/core/SDL_runapp.c
   0.0%  2.16Ki   0.0%      86    /home/icculus/projects/SDL-icculus/src/libm/s_copysign.c
   0.0%  2.05Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/video/yuv2rgb/yuv_rgb_lsx.c
   0.0%  2.02Ki   0.0%      52    /home/icculus/projects/SDL-icculus/src/libm/s_fabs.c
   0.0%  1.96Ki   0.0%      73    /home/icculus/projects/SDL-icculus/src/misc/SDL_url.c
   0.0%  1.92Ki   0.1%  1.92Ki    [section .dynstr]
   0.0%  1.90Ki   0.0%      41    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_strtokr.c
   0.0%  1.88Ki   0.0%      41    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_memmove.c
   0.0%  1.81Ki   0.0%      45    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_memcpy.c
   0.0%  1.77Ki   0.1%  1.77Ki    [section .got.plt]
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/audio/SDL_audiodev.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/SDL_render_unsupported.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d/SDL_render_d3d.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d/SDL_shaders_d3d.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d11/SDL_render_d3d11.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d11/SDL_shaders_d3d11.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d12/SDL_render_d3d12.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/direct3d12/SDL_shaders_d3d12.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/ps2/SDL_render_ps2.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/psp/SDL_render_psp.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/vitagxm/SDL_render_vita_gxm.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/vitagxm/SDL_render_vita_gxm_memory.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/render/vitagxm/SDL_render_vita_gxm_tools.c
   0.0%  1.45Ki   0.0%       0    /home/icculus/projects/SDL-icculus/src/stdlib/SDL_mslibc.c
   0.0%     655   0.0%     655    [section .rodata]
   0.0%     616   0.0%     616    [ELF Program Headers]
   0.0%     564   0.0%     564    [section .data]
   0.0%     512   0.0%     512    [section .dynamic]
   0.0%     422   0.0%       0    [section .shstrtab]
   0.0%     272   0.0%     272    [section .gnu.version_r]
   0.0%     212   0.0%     212    [section .text]
   0.0%     136   0.0%     136    [section .eh_frame]
   0.0%     120   0.0%     120    [section .rela.dyn]
   0.0%      64   0.0%      64    [ELF Header]
   0.0%      56   0.0%      56    [section .gnu.version_d]
   0.0%      43   0.0%       0    [section .comment]
   0.0%      36   0.0%      36    [section .eh_frame_hdr]
   0.0%      36   0.0%      36    [section .note.gnu.build-id]
   0.0%      32   0.0%      32    [section .got]
   0.0%      32   0.0%      32    [section .note.gnu.property]
   0.0%      27   0.0%      27    [section .init]
   0.0%      24   0.0%      24    [section .rela.plt]
   0.0%      16   0.0%       0    [section .debug_aranges]
   0.0%      16   0.0%      16    [section .plt.got]
   0.0%      13   0.0%      13    [section .fini]
   0.0%       8   0.0%       8    [section .fini_array]
   0.0%       8   0.0%       8    [section .init_array]
   0.0%       7   0.0%       7    [LOAD #1 [RX]]
   0.0%       6   0.0%       6    [LOAD #0 [R]]
   0.0%       0   0.0%     352    [section .bss]
 100.0%  12.9Mi 100.0%  1.77Mi    TOTAL

Disregarding the debug tables, the dynapi is 5% of the disk space! And 10% of the runtime memory! Probably for all the wrapper functions.

icculus commented 4 months ago

the dynapi is 5% of the disk space! And 10% of the runtime memory! Probably for all the wrapper functions!

So this is what this looks like, just to dive a little deeper.

Here's a random function (SDL_OpenAudioDevice). It looks up its own entry in the dynapi jump table and calls through to the actual function:

000000000005ff11 <SDL_OpenAudioDevice>:
   5ff11:       f3 0f 1e fa             endbr64
   5ff15:       55                      push   %rbp
   5ff16:       48 89 e5                mov    %rsp,%rbp
   5ff19:       48 83 ec 10             sub    $0x10,%rsp
   5ff1d:       89 7d fc                mov    %edi,-0x4(%rbp)
   5ff20:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
   5ff24:       48 8b 0d 7d 50 2c 00    mov    0x2c507d(%rip),%rcx        # 324fa8 <jump_table+0x19a8>
   5ff2b:       48 8b 55 f0             mov    -0x10(%rbp),%rdx
   5ff2f:       8b 45 fc                mov    -0x4(%rbp),%eax
   5ff32:       48 89 d6                mov    %rdx,%rsi
   5ff35:       89 c7                   mov    %eax,%edi
   5ff37:       ff d1                   call   *%rcx
   5ff39:       c9                      leave
   5ff3a:       c3                      ret

...in C, this is a macro:

#define SDL_DYNAPI_PROC(rc,fn,params,args,ret) \
    rc SDLCALL fn params { ret jump_table.fn args; }

(then it includes SDL_dynapi_procs.h to generate all those entry points.)
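A minimal sketch of that generation pattern, assuming a two-function API. The names `DEMO_PROCS`, `demo_add`, `demo_negate`, and `demo_jump_table` are hypothetical, and a list macro stands in for re-including SDL_dynapi_procs.h so the example fits in one file; it is not SDL's actual source.

```c
/* Hypothetical stand-ins for the real implementations (names are illustrative). */
static int demo_add_REAL(int a, int b) { return a + b; }
static int demo_negate_REAL(int a) { return -a; }

/* Stand-in for the procs header: one line per entry point,
   using the same (rc, fn, params, args, ret) shape as SDL_DYNAPI_PROC. */
#define DEMO_PROCS(X) \
    X(int, demo_add, (int a, int b), (a, b), return) \
    X(int, demo_negate, (int a), (a), return)

/* Pass 1: one function-pointer slot per entry point. */
#define X(rc, fn, params, args, ret) rc (*fn) params;
typedef struct { DEMO_PROCS(X) } demo_jump_table;
#undef X

/* Pass 2: fill the table with the real functions, in the same order. */
#define X(rc, fn, params, args, ret) fn##_REAL,
static demo_jump_table jump_table = { DEMO_PROCS(X) };
#undef X

/* Pass 3: the public wrappers, one per entry point. */
#define X(rc, fn, params, args, ret) \
    rc fn params { ret jump_table.fn args; }
DEMO_PROCS(X)
#undef X
```

Each new line in the list adds a table slot and a wrapper function, which is why the per-wrapper byte count matters once the list grows long.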

This could be dramatically smaller if we simply found the function pointer in the jump table at a fixed offset and literally JMP'd to it. We don't need to manage function parameters, return values, or the stack; they're already all in place from the actual caller and the actual called function.

One of these isn't a big reduction (the above code, assembled, is 42 bytes), but there are almost 1000 entry points in SDL3 now, and it adds up.

We don't do this because it would probably need assembly code for each platform/compiler/etc., whereas the C code is simple, but that's where a lot of space is going. Maybe there's a compiler-specific way to convince the compiler to perform a tail-call optimization?
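One possible direction, sketched under assumptions: the listing above spills the arguments to the stack before the indirect call, but with optimization enabled GCC and Clang will usually turn a wrapper of this shape into a single indirect jump on their own, although nothing guarantees it. Clang (and recent GCC releases) also offer a `musttail` statement attribute that requests the tail call explicitly. The names `demo_open`, `demo_spec`, and `DEMO_MUSTTAIL` below are hypothetical, and whether this is acceptable for every compiler and target SDL supports is untested here.

```c
/* Use the musttail attribute only where the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define DEMO_MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef DEMO_MUSTTAIL
#  define DEMO_MUSTTAIL /* fall back to the optimizer's own tail-call pass */
#endif

/* Hypothetical argument type, standing in for a real spec struct. */
typedef struct demo_spec { int freq; } demo_spec;

/* Hypothetical real implementation behind the jump table. */
static int demo_open_REAL(unsigned devid, const demo_spec *spec)
{
    return spec ? (int)devid + spec->freq : -1;
}

static struct {
    int (*demo_open)(unsigned, const demo_spec *);
} jump_table = { demo_open_REAL };

/* Public wrapper: same signature as the target, so the arguments are
   already in place and the call can become a jump through the table. */
int demo_open(unsigned devid, const demo_spec *spec)
{
    DEMO_MUSTTAIL return jump_table.demo_open(devid, spec);
}
```

Comparing the disassembly of `demo_open` built with and without optimization would show whether the compiler in use emits the indirect jump.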

The other thing is that each entry point also has a _DEFAULT version, which decides how to set up the jump table:

0000000000055a76 <SDL_OpenAudioDevice_DEFAULT>:
   55a76:       f3 0f 1e fa             endbr64
   55a7a:       55                      push   %rbp
   55a7b:       48 89 e5                mov    %rsp,%rbp
   55a7e:       48 83 ec 10             sub    $0x10,%rsp
   55a82:       89 7d fc                mov    %edi,-0x4(%rbp)
   55a85:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
   55a89:       e8 c0 00 01 00          call   65b4e <SDL_InitDynamicAPI>
   55a8e:       48 8b 0d 13 f5 2c 00    mov    0x2cf513(%rip),%rcx        # 324fa8 <jump_table+0x19a8>
   55a95:       48 8b 55 f0             mov    -0x10(%rbp),%rdx
   55a99:       8b 45 fc                mov    -0x4(%rbp),%eax
   55a9c:       48 89 d6                mov    %rdx,%rsi
   55a9f:       89 c7                   mov    %eax,%edi
   55aa1:       ff d1                   call   *%rcx
   55aa3:       c9                      leave
   55aa4:       c3                      ret

Literally this macro in C:

#define SDL_DYNAPI_PROC(rc,fn,params,args,ret) \
    static rc SDLCALL fn##_DEFAULT params { \
        SDL_InitDynamicAPI(); \
        ret jump_table.fn args; \
    }

Every entry point has one of these, with the idea that no matter what SDL function you first call, it will set up the correct jump table, then make the correct call into it. Once any one of these _DEFAULT functions runs, all of them are replaced in the jump table with the _REAL functions (the actual code we wrote for SDL), either from an external copy of SDL or more usually from the same library that ran its own _DEFAULT code.
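
A stripped-down sketch of that lazy-initialization scheme, assuming a hypothetical two-function API (`lazy_add`, `lazy_mul`) written out by hand instead of generated by macro. It leaves out the spinlock and the external-library lookup that the real code has, so treat it as an illustration of the control flow only.

```c
#include <assert.h>

static int lazy_init_calls = 0; /* how many times the table was set up */

static int lazy_add_REAL(int a, int b) { return a + b; }
static int lazy_mul_REAL(int a, int b) { return a * b; }

static int lazy_add_DEFAULT(int a, int b);
static int lazy_mul_DEFAULT(int a, int b);

/* The table starts out pointing at the _DEFAULT stubs. */
static struct
{
    int (*lazy_add)(int, int);
    int (*lazy_mul)(int, int);
} lazy_table = { lazy_add_DEFAULT, lazy_mul_DEFAULT };

/* Replaces every entry at once (not thread-safe in this sketch). */
static void lazy_init_table(void)
{
    lazy_init_calls++;
    lazy_table.lazy_add = lazy_add_REAL;
    lazy_table.lazy_mul = lazy_mul_REAL;
}

/* Each stub initializes the table, then re-dispatches through it. */
static int lazy_add_DEFAULT(int a, int b)
{
    lazy_init_table();
    /* If the stub were still installed we would recurse forever. */
    assert(lazy_table.lazy_add != lazy_add_DEFAULT);
    return lazy_table.lazy_add(a, b);
}

static int lazy_mul_DEFAULT(int a, int b)
{
    lazy_init_table();
    assert(lazy_table.lazy_mul != lazy_mul_DEFAULT);
    return lazy_table.lazy_mul(a, b);
}

/* Public entry points: always just bounce through the table. */
int lazy_add(int a, int b) { return lazy_table.lazy_add(a, b); }
int lazy_mul(int a, int b) { return lazy_table.lazy_mul(a, b); }

/* Usage: whichever function is called first triggers the setup, once. */
void lazy_usage(void)
{
    assert(lazy_mul(3, 4) == 12);
    assert(lazy_add(1, 2) == 3);
    assert(lazy_init_calls == 1);
}
```

The cost is visible even in this toy: every public function needs a second, larger function whose only job is to run the setup once.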

This can't resolve in a shared library constructor/init/whatever function, because this needs to work even if SDL is statically linked to the app...but that would be the easiest way to reduce bloat, because all almost-1000 _DEFAULT functions could be eliminated completely.

I don't have immediate solutions to solve this, but I also want to note that for almost everything but desktop OSes, we don't use the dynamic API and all of it evaporates out through some macro magic, so this bloat is only a problem on desktop Windows, Linux, and macOS, which are the platforms that can most afford the extra bulk.

icculus commented 4 months ago

As for yuv_rgb_sse.c, the next biggest thing, it's not hard to figure out why:

https://github.com/libsdl-org/SDL/blob/dbdc65fc955eeaf7100dbe2f10f325720225b850/src/video/yuv2rgb/yuv_rgb_sse.c#L138-L244

slouken commented 4 months ago

As for yuv_rgb_sse.c, the next biggest thing, it's not hard to figure out why:

https://github.com/libsdl-org/SDL/blob/dbdc65fc955eeaf7100dbe2f10f325720225b850/src/video/yuv2rgb/yuv_rgb_sse.c#L138-L244

At least that one goes away if you define SDL_LEAN_AND_MEAN.

icculus commented 4 months ago

This patch drops the dynapi from this...

   5.0%   667Ki  10.8%   195Ki    /home/icculus/projects/SDL-icculus/src/dynapi/SDL_dynapi.c

...to this...

   3.1%   397Ki   6.4%   111Ki    /home/icculus/projects/SDL-icculus/src/dynapi/SDL_dynapi.c

Almost half!

This can't land in revision control as-is, as it relies on GCC's __attribute__((constructor)) to work, but it allows us to delete the _DEFAULT functions completely and guarantee we'll handle the dynapi init once, when the binary is loaded, either as a shared library or statically linked to the app.

If nothing else, after thinking through the ramifications of this change, we should maybe #ifdef this trick in where it can work (Linux, maybe macOS?), falling back to the existing _DEFAULT magic on platforms without an equivalent of that __attribute__((constructor)) magic.
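
A rough sketch of what that #ifdef could look like, assuming GCC or Clang for the constructor path. The `ctor_demo_*` / `CTOR_DEMO_*` names are hypothetical and the table has a single entry to keep it short; where the attribute isn't available, the table starts at a _DEFAULT stub instead, as it does today.

```c
#include <assert.h>

/* Assumption: GCC and Clang both accept __attribute__((constructor)). */
#if defined(__GNUC__) || defined(__clang__)
#define CTOR_DEMO_HAVE_CONSTRUCTOR 1
#else
#define CTOR_DEMO_HAVE_CONSTRUCTOR 0
#endif

static int ctor_demo_ready = 0;

static int ctor_demo_add_REAL(int a, int b) { return a + b; }

#if CTOR_DEMO_HAVE_CONSTRUCTOR
#define CTOR_DEMO_ATTR __attribute__((constructor))
#define CTOR_DEMO_INITIAL ctor_demo_add_REAL    /* no stub needed */
#else
static int ctor_demo_add_DEFAULT(int a, int b);
#define CTOR_DEMO_ATTR
#define CTOR_DEMO_INITIAL ctor_demo_add_DEFAULT /* lazy-init stub */
#endif

static struct
{
    int (*add)(int, int);
} ctor_demo_table = { CTOR_DEMO_INITIAL };

/* With the attribute, this runs once at load time, before main(). */
CTOR_DEMO_ATTR static void ctor_demo_init(void)
{
    /* Real code would check an environment variable here and might
       overwrite the table with entries from another library. */
    ctor_demo_table.add = ctor_demo_add_REAL;
    ctor_demo_ready = 1;
}

#if !CTOR_DEMO_HAVE_CONSTRUCTOR
/* Fallback: the first call through the table performs the setup. */
static int ctor_demo_add_DEFAULT(int a, int b)
{
    ctor_demo_init();
    return ctor_demo_table.add(a, b);
}
#endif

int ctor_demo_add(int a, int b) { return ctor_demo_table.add(a, b); }

/* Usage: identical for callers on either path. */
void ctor_demo_usage(void)
{
    assert(ctor_demo_add(2, 3) == 5);
    assert(ctor_demo_ready == 1);
}
```

On the constructor path the stubs disappear from the binary entirely, which is where the size win in the numbers above comes from.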

diff --git a/src/dynapi/SDL_dynapi.c b/src/dynapi/SDL_dynapi.c
index 300ddaba1..2f6d6d67c 100644
--- a/src/dynapi/SDL_dynapi.c
+++ b/src/dynapi/SDL_dynapi.c
@@ -57,8 +57,6 @@
 extern "C" {
 #endif

-static void SDL_InitDynamicAPI(void);
-
 /* BE CAREFUL CALLING ANY SDL CODE IN HERE, IT WILL BLOW UP.
    Even self-contained stuff might call SDL_Error and break everything. */

@@ -181,11 +179,9 @@ static void SDL_InitDynamicAPI(void);
 #endif

 /* Typedefs for function pointers for jump table, and predeclare funcs */
-/* The DEFAULT funcs will init jump table and then call real function. */
 /* The REAL funcs are the actual functions, name-mangled to not clash. */
 #define SDL_DYNAPI_PROC(rc, fn, params, args, ret) \
     typedef rc (SDLCALL *SDL_DYNAPIFN_##fn) params;\
-    static rc SDLCALL fn##_DEFAULT params;         \
     extern rc SDLCALL fn##_REAL params;
 #include "SDL_dynapi_procs.h"
 #undef SDL_DYNAPI_PROC
@@ -198,36 +194,13 @@ typedef struct
 #undef SDL_DYNAPI_PROC
 } SDL_DYNAPI_jump_table;

-/* Predeclare the default functions for initializing the jump table. */
-#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) static rc SDLCALL fn##_DEFAULT params;
-#include "SDL_dynapi_procs.h"
-#undef SDL_DYNAPI_PROC
-
 /* The actual jump table. */
 static SDL_DYNAPI_jump_table jump_table = {
-#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) fn##_DEFAULT,
+#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) fn##_REAL,
 #include "SDL_dynapi_procs.h"
 #undef SDL_DYNAPI_PROC
 };

-/* Default functions init the function table then call right thing. */
-#if DISABLE_JUMP_MAGIC
-#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) \
-    static rc SDLCALL fn##_DEFAULT params          \
-    {                                              \
-        SDL_InitDynamicAPI();                      \
-        ret jump_table.fn args;                    \
-    }
-#define SDL_DYNAPI_PROC_NO_VARARGS 1
-#include "SDL_dynapi_procs.h"
-#undef SDL_DYNAPI_PROC
-#undef SDL_DYNAPI_PROC_NO_VARARGS
-SDL_DYNAPI_VARARGS(static, _DEFAULT, SDL_InitDynamicAPI())
-#else
-/* !!! FIXME: need the jump magic. */
-#error Write me.
-#endif
-
 /* Public API functions to jump into the jump table. */
 #if DISABLE_JUMP_MAGIC
 #define SDL_DYNAPI_PROC(rc, fn, params, args, ret) \
@@ -371,17 +344,9 @@ static Sint32 initialize_jumptable(Uint32 apiver, void *table, Uint32 tablesize)
         if (log_calls) {
 #define SDL_DYNAPI_PROC(rc, fn, params, args, ret) jump_table.fn = fn##_LOGSDLCALLS;
 #include "SDL_dynapi_procs.h"
-#undef SDL_DYNAPI_PROC
-        } else {
-#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) jump_table.fn = fn##_REAL;
-#include "SDL_dynapi_procs.h"
 #undef SDL_DYNAPI_PROC
         }
     }
-#else
-#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) jump_table.fn = fn##_REAL;
-#include "SDL_dynapi_procs.h"
-#undef SDL_DYNAPI_PROC
 #endif

     /* Then the external table... */
@@ -474,7 +439,7 @@ extern SDL_NORETURN void SDL_ExitProcess(int exitcode);
 }
 #endif

-static void SDL_InitDynamicAPILocked(void)
+__attribute__((constructor)) static void SDL_InitDynamicAPI(void)
 {
     char *libname = SDL_getenv_REAL(SDL_DYNAMIC_API_ENVVAR);
     SDL_DYNAPI_ENTRYFN entry = NULL; /* funcs from here by default. */
@@ -523,32 +488,6 @@ static void SDL_InitDynamicAPILocked(void)
     /* we intentionally never close the newly-loaded lib, of course. */
 }

-static void SDL_InitDynamicAPI(void)
-{
-    /* So the theory is that every function in the jump table defaults to
-     *  calling this function, and then replaces itself with a version that
-     *  doesn't call this function anymore. But it's possible that, in an
-     *  extreme corner case, you can have a second thread hit this function
-     *  while the jump table is being initialized by the first.
-     * In this case, a spinlock is really painful compared to what spinlocks
-     *  _should_ be used for, but this would only happen once, and should be
-     *  insanely rare, as you would have to spin a thread outside of SDL (as
-     *  SDL_CreateThread() would also call this function before building the
-     *  new thread).
-     */
-    static SDL_bool already_initialized = SDL_FALSE;
-
-    static SDL_SpinLock lock = 0;
-    SDL_LockSpinlock_REAL(&lock);
-
-    if (!already_initialized) {
-        SDL_InitDynamicAPILocked();
-        already_initialized = SDL_TRUE;
-    }
-
-    SDL_UnlockSpinlock_REAL(&lock);
-}
-
 #else /* SDL_DYNAMIC_API */

 #include <SDL3/SDL.h>

icculus commented 4 months ago

This doesn't work because I don't know all the asm magic needed, but something like this in a separate .S file probably gets you a 2-instruction entry point that just bounces through the jump table on x86-64.

This is probably not a nightmare we want to sign up for, though.

#define SDL_DYNAPI_PROC(rc, fn, params, args, ret) fn: mov jump_table+__COUNTER__*8, %rcx ; jmp *%rcx
#include "SDL_dynapi_procs.h"
#undef SDL_DYNAPI_PROC

madebr commented 4 months ago

I tried building with llvm 16, which does the tail call optimization by default (no need for __attribute__((musttail))). Even with these tail calls, the binary size of SDL_dynapi.c is comparable to, if not larger than, what gcc produces.

0000000000045a70 <SDL_OpenAudioDevice>:
   45a70:       48 8b 05 91 56 1c 00    mov    rax,QWORD PTR [rip+0x1c5691]        # 20b108 <jump_table+0x19a0>
   45a77:       ff e0                   jmp    rax
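
For compilers that don't do this on their own, a hedged sketch of requesting the tail call explicitly. It assumes a compiler that reports `musttail` through `__has_attribute` (recent Clang, and as far as I know GCC 15 and later) and falls back to a plain `return` elsewhere, where whether you get a `jmp` is up to the optimizer. The `tail_*` names and the `DEMO_MUSTTAIL` macro are hypothetical.

```c
#include <assert.h>

/* Use musttail where the compiler advertises it; otherwise expand to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define DEMO_MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef DEMO_MUSTTAIL
#  define DEMO_MUSTTAIL /* no guarantee; rely on the optimizer */
#endif

/* Hypothetical "real" function behind the entry point. */
static int tail_open_REAL(int devid, const void *spec)
{
    (void)spec;
    return devid + 1;
}

static struct
{
    int (*tail_open)(int, const void *);
} tail_table = { tail_open_REAL };

/* Entry point: same signature as the target, so the arguments are
   already where the callee expects them and a jump is sufficient. */
int tail_open(int devid, const void *spec)
{
    DEMO_MUSTTAIL return tail_table.tail_open(devid, spec);
}

/* Usage: behaviour is the same with or without the attribute. */
void tail_usage(void)
{
    assert(tail_open(41, 0) == 42);
}
```

This only helps the public entry points; it does nothing for the _DEFAULT stubs, which have to make a real call before they can jump.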

What's bloating the binary size are the *_DEFAULT functions. The call to SDL_InitDynamicAPI requires storing all arguments on the stack.

00000000000527f0 <SDL_OpenAudioDeviceStream_DEFAULT>:
   527f0:       55                      push   rbp
   527f1:       41 57                   push   r15
   527f3:       41 56                   push   r14
   527f5:       53                      push   rbx
   527f6:       50                      push   rax
   527f7:       48 89 cb                mov    rbx,rcx
   527fa:       49 89 d6                mov    r14,rdx
   527fd:       49 89 f7                mov    r15,rsi
   52800:       89 fd                   mov    ebp,edi
   52802:       e8 39 15 00 00          call   53d40 <SDL_InitDynamicAPI>
   52807:       48 8b 05 92 89 1b 00    mov    rax,QWORD PTR [rip+0x1b8992]        # 20b1a0 <jump_table+0x1a38>
   5280e:       89 ef                   mov    edi,ebp
   52810:       4c 89 fe                mov    rsi,r15
   52813:       4c 89 f2                mov    rdx,r14
   52816:       48 89 d9                mov    rcx,rbx
   52819:       48 83 c4 08             add    rsp,0x8
   5281d:       5b                      pop    rbx
   5281e:       41 5e                   pop    r14
   52820:       41 5f                   pop    r15
   52822:       5d                      pop    rbp
   52823:       ff e0                   jmp    rax

So avoiding these functions would reduce the file size considerably!

If this is really a concern, we can do this for shared libraries:

madebr commented 4 months ago

https://github.com/madebr/SDL/commit/1fc355bb0d566ea72c499334903c3548d401ab80 implements my last proposal. It reduces a 32-bit MSVC SDL3.dll by 33 kiB. Not that impressive.

what      before (bytes)    after (bytes)    delta (bytes)
VC x86    2,596,864         2,563,584        -33,280
VC x64    3,067,904         2,987,008        -80,896

icculus commented 4 months ago

Not that impressive

It's 2.5% of the total binary size in the 64-bit case. Get the tail-call magic in there too and you probably have a 5% total reduction, which is pretty darn good for the effort!

madebr commented 4 months ago

Removing the *_DEFAULT symbols on Windows means initializing the jump table in DllMain. This also means doing LoadLibrary in DllMain, which msdn advises not to do. I don't think the risk of a deadlock is increased, though, when SDL3.dll loads its dependencies dynamically as much as possible.

https://github.com/madebr/SDL/commit/09b477c8ed5a6aecdf4672a84dacf59e3cdd5b70

slouken commented 4 months ago

This also means doing LoadLibrary in DllMain, which msdn advises not to do.

This seems like a terrible idea.

sezero commented 4 months ago

This also means doing LoadLibrary in DllMain, which msdn advises not to do.

This seems like a terrible idea.

Which is what we are already doing in sdl12-compat and sdl2-compat...

slouken commented 4 months ago

Moving along... :)

AntTheAlchemist commented 4 months ago

Can we push more things into SDL_LEAN_AND_MEAN ?

And is everything wrapped in SDL_LEAN_AND_MEAN properly? While paging through code, I noticed SDL_SW_RenderGeometryRaw isn't excluded?

slouken commented 4 months ago

Yes, and I think now's your opportunity to finally learn how to do PRs :)

Feel free to suggest PRs wrapping more functionality in SDL_LEAN_AND_MEAN. The goal should be to reduce binary size while keeping API functionality, only removing things that are not likely to be used by most applications. It should not overlap disabling subsystems, which can already be done independently. Ideally, each PR would show the before and after size so we can evaluate whether the additional code complexity / functionality reduction is worth the trade-off.

AntTheAlchemist commented 4 months ago

I think now's your opportunity to finally learn how to do PRs :)

You're a hard task-master - but that's fair, lol. I'll have a crack at it.

slouken commented 4 months ago

GitHub actually makes it pretty easy. You can just click the “fork” button on the SDL repo to create your own copy which you can clone to your local system.

My usual workflow is:

1. Go to GitHub and sync my fork
2. git pull
3. git branch some-branch
4. git checkout some-branch
5. Make edits, build and test
6. git commit -a
7. git push
8. Wait for CI to give me build errors
9. Fix the errors locally
10. git commit -a
11. git rebase -i origin (use ‘f’ to compress the new commit into the original one)
12. git push -f
13. Wait for CI to succeed
14. Go to GitHub and create a pull request from my branch
15. git checkout main and I’m done

madebr commented 4 months ago

Perhaps the biggest non-trivial thing about committing to GitHub is configuring your SSH keys.

On my (Linux) system, the private keys are stored in ~/.ssh. The private key is configured in ~/.ssh/config by this item:

Host github.com
    IdentityFile ~/.ssh/github_madebr_private_key
    IdentitiesOnly yes
    User madebr

AntTheAlchemist commented 4 months ago

Assume I've just travelled in time from the 80s. The entire process is so alien to me, so the learning curve is massive. I've never heard of SSH keys or CLs. I've never installed any GIT related software. None of these steps are trivial. I think the last time I tried to learn, the instructions assumed I was using Linux and I'm only using Windows, so I couldn't proceed. I found it easier to download the source as a .zip and build using the Visual Studio project. I've been asking various chat bots to help me with the process and none of it makes sense.

slouken commented 4 months ago

Chat bots are dumb. Send me e-mail at slouken@libsdl.org and we can coordinate a time to chat through this.

ccawley2011 commented 4 months ago

As for yuv_rgb_sse.c, the next biggest thing, it's not hard to figure out why:

https://github.com/libsdl-org/SDL/blob/dbdc65fc955eeaf7100dbe2f10f325720225b850/src/video/yuv2rgb/yuv_rgb_sse.c#L138-L244

Something that might help would be to conditionally enable the fast paths for blitting based on which video and render drivers are enabled. That way, it would be possible to disable any blitters where the destination format isn't supported as a texture format or a framebuffer format.