libretro / desmume2015

Port of Desmume to libretro based on Desmume SVN circa 2015.

Desmume libretro core runs much slower than standalone Desmume #14

Open chrisacheson opened 10 years ago

chrisacheson commented 10 years ago

Testing with Castlevania: Portrait of Ruin on a 2.93 GHz Intel Core 2 Duo with 4 GB RAM. The standalone version of Desmume runs at 40-45 fps in interpreted mode and 70-75 fps in JIT mode (with the fps limit disabled). Libretro runs at 25-30 fps in interpreted mode and 30-35 fps in JIT mode, and has choppy/echoing sound.

These are all in-game, while I'm actively moving around and attacking. I tested with Desmume 0.9.9 from the Ubuntu repo, the latest Desmume SVN (r5046), libretro-desmume from hunterk's PPA, and the latest version from this repo (999558b952dec28c0ca281ea4a19e2a71e2c6664).

Monroe88 commented 10 years ago

Make sure Hard GPU Sync is either off or Hard GPU Sync Frames is set to 1 or higher, as that feature is quite demanding at 0 frames and can cause cores that only run a bit above 60 fps to drop below full speed.

Also, disable rewind, because it massively slows down this core: it runs at about half speed on my end even with the JIT enabled.
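The two checks suggested above can be automated against a RetroArch config file. The following is a minimal sketch, assuming the settings are stored as `key = "value"` lines under the key names `rewind_enable`, `video_hard_sync` and `video_hard_sync_frames` (the names used in the retroarch.cfg posted in this thread). The helper names `parse_cfg` and `performance_warnings` are hypothetical and not part of RetroArch.

```python
import re

def parse_cfg(text):
    """Parse `key = "value"` lines into a dict, ignoring anything else."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r'\s*(\w+)\s*=\s*"(.*)"\s*$', line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings

def performance_warnings(settings):
    """Return a warning for each setting the advice above says to change."""
    warnings = []
    # Rewind is reported to roughly halve the speed of this core.
    if settings.get("rewind_enable") == "true":
        warnings.append("rewind_enable is true: disable rewind for this core")
    # Hard GPU Sync is reported to be demanding when its frame count is 0.
    if (settings.get("video_hard_sync") == "true"
            and settings.get("video_hard_sync_frames", "0") == "0"):
        warnings.append("video_hard_sync is on with 0 frames: "
                        "turn it off or set video_hard_sync_frames to 1 or higher")
    return warnings

example = 'rewind_enable = "true"\nvideo_hard_sync = "false"\n'
for warning in performance_warnings(parse_cfg(example)):
    print(warning)
```

To check a real file, pass the contents of retroarch.cfg to `parse_cfg` instead of the `example` string.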

chrisacheson commented 10 years ago

Hard GPU Sync and Rewind are both off. In case it can shed any light on the situation, here's my retroarch.cfg (with the button mappings for players 2-8 removed to cut it down a bit):

config_save_on_exit = "true"
input_axis_threshold = "0.500000"
load_dummy_on_core_shutdown = "true"
fps_show = "false"
rewind_enable = "false"
audio_latency = "64"
audio_sync = "true"
audio_block_frames = "0"
rewind_granularity = "1"
video_shader_enable = "true"
video_aspect_ratio = "-1.000000"
video_windowed_fullscreen = "true"
video_xscale = "3.000000"
autosave_interval = "0"
video_yscale = "3.000000"
video_crop_overscan = "true"
video_scale_integer = "false"
video_smooth = "false"
video_threaded = "false"
video_shared_context = "false"
video_fullscreen = "true"
video_refresh_rate = "59.950001"
video_monitor_index = "0"
video_fullscreen_x = "0"
video_fullscreen_y = "0"
video_driver = "gl"
menu_driver = "rgui"
video_vsync = "true"
video_hard_sync = "false"
video_hard_sync_frames = "0"
video_black_frame_insertion = "false"
video_disable_composition = "false"
pause_nonactive = "false"
video_swap_interval = "1"
video_gpu_screenshot = "true"
video_rotation = "0"
screenshot_directory = "default"
aspect_ratio_index = "6"
audio_rate_control = "true"
audio_rate_control_delta = "0.005000"
audio_driver = "alsa"
audio_out_rate = "48000"
video_font_size = "32.000000"
video_font_enable = "true"
system_directory = "~/.config/retroarch/system"
audio_resampler = "sinc"
savefile_directory = "default"
savestate_directory = "default"
video_shader_dir = "default"
video_filter_dir = "default"
audio_filter_dir = "default"
content_directory = "default"
assets_directory = "default"
rgui_browser_directory = "/storage/roms"
rgui_config_directory = "default"
rgui_show_start_screen = "false"
game_history_size = "100"
input_autodetect_enable = "true"
overlay_directory = "default"
input_overlay_enable = "false"
input_overlay_opacity = "0.700000"
input_overlay_scale = "1.000000"
gamma_correction = "false"
triple_buffering_enable = "false"
soft_filter_enable = "false"
flicker_filter_enable = "false"
flicker_filter_index = "0"
soft_filter_index = "0"
current_resolution_id = "0"
custom_viewport_width = "941"
custom_viewport_height = "706"
custom_viewport_x = "9"
custom_viewport_y = "0"
block_sram_overwrite = "false"
savestate_auto_index = "false"
savestate_auto_save = "false"
savestate_auto_load = "true"
fastforward_ratio = "-1.000000"
slowmotion_ratio = "3.000000"
sound_mode = "0"
state_slot = "0"
netplay_spectator_mode_enable = "false"
netplay_mode = "false"
netplay_ip_port = "0"
netplay_delay_frames = "0"
custom_bgm_enable = "false"
input_driver = "udev"
input_device_p1 = "0"
input_player1_joypad_index = "0"
input_libretro_device_p1 = "5"
input_player1_analog_dpad_mode = "0"
input_device_p2 = "0"
input_player2_joypad_index = "1"
input_libretro_device_p2 = "1"
input_player2_analog_dpad_mode = "0"
input_device_p3 = "0"
input_player3_joypad_index = "2"
input_libretro_device_p3 = "1"
input_player3_analog_dpad_mode = "0"
input_device_p4 = "0"
input_player4_joypad_index = "3"
input_libretro_device_p4 = "1"
input_player4_analog_dpad_mode = "0"
input_device_p5 = "0"
input_player5_joypad_index = "4"
input_libretro_device_p5 = "1"
input_player5_analog_dpad_mode = "0"
input_device_p6 = "0"
input_player6_joypad_index = "5"
input_libretro_device_p6 = "1"
input_player6_analog_dpad_mode = "0"
input_device_p7 = "0"
input_player7_joypad_index = "6"
input_libretro_device_p7 = "1"
input_player7_analog_dpad_mode = "0"
input_device_p8 = "0"
input_player8_joypad_index = "7"
input_libretro_device_p8 = "1"
input_player8_analog_dpad_mode = "0"
input_player1_b = "z"
input_player1_b_btn = "2"
input_player1_b_axis = "nul"
input_player1_y = "a"
input_player1_y_btn = "3"
input_player1_y_axis = "nul"
input_player1_select = "rshift"
input_player1_select_btn = "8"
input_player1_select_axis = "nul"
input_player1_start = "enter"
input_player1_start_btn = "11"
input_player1_start_axis = "nul"
input_player1_up = "up"
input_player1_up_btn = "12"
input_player1_up_axis = "nul"
input_player1_down = "down"
input_player1_down_btn = "14"
input_player1_down_axis = "nul"
input_player1_left = "left"
input_player1_left_btn = "15"
input_player1_left_axis = "nul"
input_player1_right = "right"
input_player1_right_btn = "13"
input_player1_right_axis = "nul"
input_player1_a = "x"
input_player1_a_btn = "1"
input_player1_a_axis = "nul"
input_player1_x = "s"
input_player1_x_btn = "0"
input_player1_x_axis = "nul"
input_player1_l = "q"
input_player1_l_btn = "6"
input_player1_l_axis = "nul"
input_player1_r = "w"
input_player1_r_btn = "7"
input_player1_r_axis = "nul"
input_player1_l2 = "nul"
input_player1_l2_btn = "4"
input_player1_l2_axis = "nul"
input_player1_r2 = "nul"
input_player1_r2_btn = "5"
input_player1_r2_axis = "nul"
input_player1_l3 = "nul"
input_player1_l3_btn = "9"
input_player1_l3_axis = "nul"
input_player1_r3 = "nul"
input_player1_r3_btn = "10"
input_player1_r3_axis = "nul"
input_player1_l_x_plus = "nul"
input_player1_l_x_plus_btn = "nul"
input_player1_l_x_plus_axis = "+0"
input_player1_l_x_minus = "nul"
input_player1_l_x_minus_btn = "nul"
input_player1_l_x_minus_axis = "-0"
input_player1_l_y_plus = "nul"
input_player1_l_y_plus_btn = "nul"
input_player1_l_y_plus_axis = "+1"
input_player1_l_y_minus = "nul"
input_player1_l_y_minus_btn = "nul"
input_player1_l_y_minus_axis = "-1"
input_player1_r_x_plus = "nul"
input_player1_r_x_plus_btn = "nul"
input_player1_r_x_plus_axis = "+2"
input_player1_r_x_minus = "nul"
input_player1_r_x_minus_btn = "nul"
input_player1_r_x_minus_axis = "-2"
input_player1_r_y_plus = "nul"
input_player1_r_y_plus_btn = "nul"
input_player1_r_y_plus_axis = "+3"
input_player1_r_y_minus = "nul"
input_player1_r_y_minus_btn = "nul"
input_player1_r_y_minus_axis = "-3"
input_player1_turbo = "nul"
input_player1_turbo_btn = "nul"
input_player1_turbo_axis = "nul"
input_toggle_fast_forward = "space"
input_toggle_fast_forward_btn = "nul"
input_toggle_fast_forward_axis = "nul"
input_hold_fast_forward = "l"
input_hold_fast_forward_btn = "nul"
input_hold_fast_forward_axis = "nul"
input_load_state = "f4"
input_load_state_btn = "nul"
input_load_state_axis = "nul"
input_save_state = "f2"
input_save_state_btn = "nul"
input_save_state_axis = "nul"
input_toggle_fullscreen = "f"
input_toggle_fullscreen_btn = "nul"
input_toggle_fullscreen_axis = "nul"
input_exit_emulator = "escape"
input_exit_emulator_btn = "nul"
input_exit_emulator_axis = "nul"
input_state_slot_increase = "f7"
input_state_slot_increase_btn = "nul"
input_state_slot_increase_axis = "nul"
input_state_slot_decrease = "f6"
input_state_slot_decrease_btn = "nul"
input_state_slot_decrease_axis = "nul"
input_rewind = "r"
input_rewind_btn = "nul"
input_rewind_axis = "nul"
input_movie_record_toggle = "o"
input_movie_record_toggle_btn = "nul"
input_movie_record_toggle_axis = "nul"
input_pause_toggle = "p"
input_pause_toggle_btn = "nul"
input_pause_toggle_axis = "nul"
input_frame_advance = "k"
input_frame_advance_btn = "nul"
input_frame_advance_axis = "nul"
input_reset = "h"
input_reset_btn = "nul"
input_reset_axis = "nul"
input_shader_next = "m"
input_shader_next_btn = "nul"
input_shader_next_axis = "nul"
input_shader_prev = "n"
input_shader_prev_btn = "nul"
input_shader_prev_axis = "nul"
input_cheat_index_plus = "y"
input_cheat_index_plus_btn = "nul"
input_cheat_index_plus_axis = "nul"
input_cheat_index_minus = "t"
input_cheat_index_minus_btn = "nul"
input_cheat_index_minus_axis = "nul"
input_cheat_toggle = "u"
input_cheat_toggle_btn = "nul"
input_cheat_toggle_axis = "nul"
input_screenshot = "f8"
input_screenshot_btn = "nul"
input_screenshot_axis = "nul"
input_audio_mute = "f9"
input_audio_mute_btn = "nul"
input_audio_mute_axis = "nul"
input_netplay_flip_players = "i"
input_netplay_flip_players_btn = "nul"
input_netplay_flip_players_axis = "nul"
input_slowmotion = "e"
input_slowmotion_btn = "nul"
input_slowmotion_axis = "nul"
input_enable_hotkey = "nul"
input_enable_hotkey_btn = "nul"
input_enable_hotkey_axis = "nul"
input_volume_up = "add"
input_volume_up_btn = "nul"
input_volume_up_axis = "nul"
input_volume_down = "subtract"
input_volume_down_btn = "nul"
input_volume_down_axis = "nul"
input_overlay_next = "nul"
input_overlay_next_btn = "nul"
input_overlay_next_axis = "nul"
input_disk_eject_toggle = "nul"
input_disk_eject_toggle_btn = "nul"
input_disk_eject_toggle_axis = "nul"
input_disk_next = "nul"
input_disk_next_btn = "nul"
input_disk_next_axis = "nul"
input_grab_mouse_toggle = "f11"
input_grab_mouse_toggle_btn = "nul"
input_grab_mouse_toggle_axis = "nul"
input_menu_toggle = "f1"
input_menu_toggle_btn = "nul"
input_menu_toggle_axis = "nul"
core_specific_config = "false"
libretro_log_level = "0"
log_verbosity = "false"
perfcnt_enable = "false"
libretro_directory = "/usr/lib/libretro"
libretro_path = "/usr/lib/libretro/mednafen_psx_libretro.so"
libretro_info_path = ""
cheat_database_path = ""
video_shader = ""
audio_device = ""
audio_dsp_plugin = ""
extraction_directory = ""
video_filter = ""
game_history_path = ""
joypad_autoconfig_dir = ""
input_overlay = ""
netplay_nickname = ""
netplay_ip_address = ""
input_joypad_driver = ""
input_keyboard_layout = ""

I am using the open-source radeon driver (with a Radeon HD4870). If it were a driver issue though, I would expect standalone Desmume to be slow too.

Any advice on how I might begin digging into this myself? I'm a reasonably competent programmer, but haven't done much with game or emulator code. Would I use something like gprof?

sergiobenrocha2 commented 9 years ago

Using a Core 2 Duo T6600 (2.20 GHz) with New Super Mario Bros. (0434):

60 fps with JIT in standalone, 48 fps with interpreter.

44 fps with JIT in the libretro core, 35 fps with the interpreter. With Audio Driver set to Null, 60 fps in both JIT and interpreter.
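As a quick sanity check on these numbers, the relative speed of the core can be worked out per CPU mode. This is only arithmetic on the figures reported above; `relative_speed` is a hypothetical helper name, and the 60 fps standalone JIT figure may be capped by a frame limiter, so the JIT ratio is at best an upper bound.

```python
# (standalone_fps, libretro_fps) pairs as reported in this comment
measurements = {"jit": (60, 44), "interpreter": (48, 35)}

def relative_speed(standalone_fps, libretro_fps):
    """Fraction of standalone speed that the libretro core reaches."""
    return libretro_fps / standalone_fps

for mode, (standalone, core) in measurements.items():
    print(f"{mode}: {relative_speed(standalone, core):.0%} of standalone")
# jit: 73% of standalone
# interpreter: 73% of standalone
```

Both modes land at roughly the same fraction, and the gap disappears with the Null audio driver, which suggests the cost sits in the audio path rather than in CPU emulation.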

Hard GPU Sync and Rewind are both off

inactive123 commented 9 years ago

What I noticed when running it with sudo perf top is that audio_batch_cb is being spammed a lot. It easily is the 'warmest' function out of all profiled functions, so something must be obviously going wrong here.

So there might be a libretro integration issue with the core to do with the frequency at which audio samples are pushed. I'd appreciate some help here from guys like @JackosDev to investigate if we can maybe get better performance by doing some tweaks to the way audio is being processed right now.
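To illustrate why the push frequency matters: libretro offers both a per-sample audio callback and a batch callback, and calling either one very often adds per-call overhead. The sketch below is a Python model of batching, not the core's code and not necessarily what any later fix does. `BatchedAudio` and `fake_audio_batch_cb` are hypothetical names, and the figure of 800 audio frames per video frame is purely illustrative.

```python
class BatchedAudio:
    """Collect stereo frames and hand them to the frontend in one call."""

    def __init__(self, audio_batch_cb):
        self._cb = audio_batch_cb  # stand-in for the frontend's batch callback
        self._pending = []         # interleaved samples: L, R, L, R, ...

    def push(self, left, right):
        """Queue one stereo frame without calling the frontend."""
        self._pending.extend((left, right))

    def flush(self):
        """Deliver everything queued so far in a single callback call."""
        if self._pending:
            self._cb(self._pending, len(self._pending) // 2)
            self._pending = []

calls = []

def fake_audio_batch_cb(samples, frames):
    # Record how many frames each call delivered, to count the calls made.
    calls.append(frames)
    return frames

audio = BatchedAudio(fake_audio_batch_cb)
for _ in range(800):  # illustrative number of audio frames per video frame
    audio.push(0, 0)
audio.flush()         # once per emulated video frame
print(calls)          # [800]
```

With this shape the frontend sees one call per video frame instead of one call per small chunk of samples.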

JackosDev commented 9 years ago

I've been very busy lately, but once I have time I'll take a look into it.

inactive123 commented 9 years ago

You should see a very drastic performance increase as of this commit:

https://github.com/libretro/desmume/commit/b0297e7e6eae30cd3b3e83c2786ee6ae55e4320d

To the OP: let me know if this brings it up to parity and if we can close this issue then.

sergiobenrocha2 commented 9 years ago

Testing with the Core 2 Duo: I now see 52-56 fps on the world map of New Super Mario Bros., and 60 fps in the levels. It's much better than before.

sergiobenrocha2 commented 9 years ago

It randomly slows down to 36 fps on my Core i7 after playing for some time.

andres-asm commented 9 years ago

On what platform? I see the same on Windows; disabling Hard GPU Sync fixes it for me.

inactive123 commented 9 years ago

Dual SPU and SPU Interpolation core options were causing some issues here. Turns out both are not really needed and we will want to have the SPU always be synchronous for libretro.

Anyway, I pushed some changes and removed the Dual SPU/SPU Interpolation codepaths (SPU Interpolation was never an NDS hardware feature anyway, and both interpolation levels had audio quality drawbacks, so it might just be more trouble than it's worth). The performance should be stable now.

http://wiki.desmume.org/index.php?title=DeSmuME_Manual_for_the_Windows_port#Config_.7C_Sound_Settings

sergiobenrocha2 commented 9 years ago

I can't reproduce the issue with kernel 3.13, which has the linux-tools, heh

In normal use, I can see that:

55,99%  [.] void renderline_textBG<false>(GPU*, unsigned short, unsigned short, unsigned short)
7,15%  [.] GPU_RenderLine(NDS_Screen*, unsigned short, bool)
6,72%  [.] void GPU::_spriteRender<(GPU::SpriteRenderMode)0>(unsigned short*, unsigned char*, unsigned char*, unsigned char*)
4,62%  [.] void RasterizerUnit<true>::runscanlines<true, false>(PolygonAttributes const&, FragmentColor*, unsigned long, unsigned long, edge_fx_fl*, edge_fx_fl*, bool, bool)
[...]

sergiobenrocha2 commented 9 years ago

I can "force" the issue by running anything that uses a lot of CPU, and even after I stop it, the issue persists in desmume for a while:

[Screenshots: issue 0, issue 3, issue 4]

andres-asm commented 9 years ago

Yeah, I think this is not a performance issue per se but something internal to the core screwing us up.

sergiobenrocha2 commented 9 years ago

Getting lots of "RetroArch [INFO] :: [PulseAudio]: Underrun (Buffer: 18432, Writable size: 18432)."
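One way to read that message, assuming "Writable size" is the free space in a buffer of size "Buffer", is that the buffer had fully drained when the underrun was logged. The sketch below extracts the two numbers and computes how full the buffer was. The regular expression matches only the message format quoted above, and `buffer_fill` is a hypothetical helper name.

```python
import re

UNDERRUN_RE = re.compile(r"Underrun \(Buffer: (\d+), Writable size: (\d+)\)")

def buffer_fill(log_line):
    """Return the fraction of the buffer still holding data, or None if no match."""
    match = UNDERRUN_RE.search(log_line)
    if match is None:
        return None
    buffer_size, writable = map(int, match.groups())
    # Assumption: writable space is free space, so the rest is queued audio.
    return 1.0 - writable / buffer_size

line = "RetroArch [INFO] :: [PulseAudio]: Underrun (Buffer: 18432, Writable size: 18432)."
print(buffer_fill(line))  # 0.0
```

A fill of 0.0 would mean the core is not producing audio fast enough to keep the output fed, which is consistent with the choppy sound reported earlier in the thread.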

dhhdev commented 9 years ago

On Windows 10 with a freshly installed RetroArch (the latest nightly build as of this comment), a few driver tweaks, and both Hard GPU Sync and rewind off, I get a stable 59-60 FPS with no sound glitches. Before changing the default configuration, I got as low as 35 FPS in sound-intensive scenes, such as the loading splash for Pokémon White Version 1. The only driver I changed was the audio driver, from "dsound" to "xsound"; the video driver is still "gl".

I tried the most recent stable version of standalone DeSmuME and it worked flawlessly, even with added filters, so I'm not sure what the cause is. I do remember seeing the same PulseAudio log messages that sergiobenrocha2 reported on my Arch install a few months back, though. Maybe that information helps.

alkaseltzerspadt commented 8 years ago

@chrisacheson Hello there. Sorry, I know I am very late (two years, actually). If you still use this program, I think I may have found the solution: try going into the RGUI settings and turning on Threaded Video. For me, it kept the audio rate and FPS constant. I hope things have been resolved, and I hope this helps others too. Have a good one.

JorgeHawkins commented 7 years ago

Games like Mario Kart DS keep running at an inconsistent framerate. Full fps as long as there are NO OTHER KARTS on the same screen. I have to toggle frameskip to 1 to play without issues, but I do not like frameskipping.

Specs:
CPU: Core 2 Quad Q8200 @ 2.4 GHz
RAM: 8 GB
OS: Windows 10
HDD: 2 x 512 GB