libretro / pcsx_rearmed

ARM optimized PCSX fork
GNU General Public License v2.0

lightrec's threaded compiler deadlocks on x86_64 #697

Closed by notaz 10 months ago

notaz commented 1 year ago

Alien Resurrection - while booting
Driver 2 - before the main menu
Duke Nukem - Land Of The Babes - before the main menu
...

pcercuei commented 1 year ago

@notaz I generally test only with the standalone emulator, and the deadlocks you report do not happen there (I can see them on the libretro core though).

I build the standalone version from your repository + some commits on top, but I'm always lagging a bit behind. My tree is https://github.com/pcercuei/pcsx_rearmed/tree/lightrec

My guess is that the deadlock appeared in one of the upstream commits since https://github.com/notaz/pcsx_rearmed/commit/d9e2b173fb11fea4976fb0a6c5feda6b654b4b46 (which is the base of my own branch).

notaz commented 1 year ago

Hmm, if I check out the same commit point as the libretro fork (7f9e081d2ab141e7e3f7a84cb2003880d48f8942), the deadlock already happens. Perhaps the fork is missing something from your tree? I think CPU emulation related things should be in sync between the libretro fork and my repo.

BTW you can now build the standalone in libretro fork too if you pull the submodules.

git clone https://github.com/libretro/pcsx_rearmed.git
cd pcsx_rearmed/
git submodule init && git submodule update
./configure
make HAVE_CHD=1 DYNAREC=lightrec

pcercuei commented 1 year ago

@notaz I rebased my "lightrec" branch onto your latest master (d5aeda2). It still works fine, no crashes on these games here.

pcercuei commented 1 year ago

@notaz everything works perfectly fine here, even with the standalone build created from latest libretro/master (f94d3b198b), after I save the global config in the menu (to force the BIOS file to be detected, otherwise it tries to use HLE, which Lightrec doesn't support yet).

notaz commented 1 year ago

Today I tried several things (different bios, clean checkout) and standalone libretro/master deadlocks no matter what (trying Alien Resurrection here). Can you try a debug build (after pull, make DEBUG=1) to rule out compiler differences? Tested with bios: SCPH1001.BIN scph7001.bin etc.

Also, if you don't load a game from the command line and use the menu instead, or load a working game, press Esc and select "reset game", it crashes in generated code reading 1f000084 (seems to be a raw emulated pointer of the BIOS checking for an expansion device).

Edit: as for your fork, I have no idea which exact dependencies to use so gave up after it failed to compile.

pcercuei commented 1 year ago

Works with make DEBUG=1 as well.

Can you give me a hash of your bios? My MD5 is 924e392ed05558ffdb115408c263dccf.

pcercuei commented 1 year ago

@notaz Here a standalone build does not seem to use Lightrec's memory map (LIGHTREC_CUSTOM_MAP is zero) while the libretro core does. It should still work though.

Could you uncomment lines 410-414 of libpcsxcore/lightrec/plugin.c, and tell me what memory map you obtain?

notaz commented 1 year ago

My BIOS is the same as yours.

> Could you uncomment lines 410-414 of libpcsxcore/lightrec/plugin.c

Memory map is sub-par. Emitted code will be slow.
M=0x80000000, P=0x80200000, R=0x1fc00000, H=0x1f800000

with LIGHTREC_CUSTOM_MAP=1:

Memory map is sub-par. Emitted code will be slow.
Using 32-bit LUT
M=0x10000000, P=0x2f000000, R=0x2fc00000, H=0x2f800000

And still a deadlock. No idea what it is that we have so different, I guess I'll give up on it for now. I haven't tried to make memory address 0 mappable; I have no desire to do it, as it opens up all kinds of kernel exploits.

pcercuei commented 1 year ago

@notaz try to disable the threaded compiler, ENABLE_THREADED_COMPILER=0 in include/lightrec/lightrec-config.h and remove recompiler.o and reaper.o from the list of object files.
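For reference, the header change suggested above would look roughly like this. This is only a sketch, assuming the option is an ordinary preprocessor define in that header; the rest of the file is not shown in this thread and is omitted here.

```c
/* include/lightrec/lightrec-config.h (excerpt only).
 * Assumption: the option is a plain #define; all other macros omitted. */
#define ENABLE_THREADED_COMPILER 0
```

The object-file part (dropping recompiler.o and reaper.o) is a separate edit in the build files and is not sketched here.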

pcercuei commented 1 year ago

@notaz Actually could you try a build with GNU Lightning at version 729225d? As far as I can see commit 0e6dd94 (the following one) introduced a regression where live registers are incorrectly detected as dead and used as temporaries. It definitely triggers on PowerPC but the code is arch-independent, so it may very well be what you're seeing here (although it wouldn't explain why I don't get the crashes myself).

notaz commented 1 year ago

(still testing Alien Resurrection)
729225d - no more deadlock, boots correctly, in-game main menu is very slow (feels like some cycle counting issue)
0e6dd94 - no more deadlock but various misbehavior (FMV not displaying, etc), slow main menu

Both still have standalone emu menu's "reset game" segfault, as described before.

notaz commented 1 year ago

Forgot I've also set ENABLE_THREADED_COMPILER=0 in the above test.

With ENABLE_THREADED_COMPILER=1:
729225d - deadlock
0e6dd94 - no deadlock but various misbehavior, same as in previous post

pcercuei commented 1 year ago

These issues you are having make no sense to me. Alien Resurrection (SLUS00633) works perfectly fine here even with the bogus Lightning commit. The menu does not look particularly slow to me; it looks like it runs at full speed. Are you sure you don't have something in your config (e.g. CPU cycles bias) that would mess things up?

notaz commented 1 year ago

I was testing the libretro fork (because it has everything baked in), and it turns out AR is broken there even without lightrec enabled. Some drive-by contributors probably messed something up as my tree has no such problems.

bslenul commented 1 year ago

FWIW, this is what I get with 8622c9d on RetroArch with lightrec and with a BIOS:

LibretroAdmin commented 1 year ago

Hi, how can we fix things here? This is troubling to hear. @notaz any suggestions?

notaz commented 1 year ago

Maybe disable ENABLE_THREADED_COMPILER until it's fixed. I'm not 100% sure but I think that solved the issue (can't sink time into this right now to confirm).

pcercuei commented 1 year ago

Well, didn't you say it also happens even without Lightrec enabled? That would mean the threaded compiler is not the problem...

bslenul commented 1 year ago

> Maybe disable ENABLE_THREADED_COMPILER until it's fixed. I'm not 100% sure but I think that solved the issue (can't sink time into this right now to confirm).

I imagine this is more a workaround than a fix but yeah, on Windows no more issue with Duke Nukem, and on my Linux VM the 3 games now launch properly as well.

notaz commented 1 year ago

> Well, didn't you say it also happens even without Lightrec enabled? That would mean the threaded compiler is not the problem...

That was for the main menu slowdown in Alien Resurrection. The deadlock (of the whole emulator) is a separate issue.

pcercuei commented 1 year ago

Understood.

pcercuei commented 1 year ago

I created #705 to disable the threaded compiler. But it doesn't fix the underlying problem...

notaz commented 11 months ago

Is the threaded compiler still a thing, or can we just close this? How viable is it anyway since we can't execute what's not compiled yet?

pcercuei commented 11 months ago

@notaz it is still very much a thing, but it has a rare race that is hard to trigger (not deterministic) and that I haven't fixed yet (and I have no idea how to use valgrind). When the threaded compiler is enabled, Lightrec will just use the interpreter for new blocks until they are compiled, which is faster than compiling each and every block in-line.
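To illustrate the fallback described above, here is a minimal sketch using POSIX threads. It is not Lightrec's code: `struct block`, `interpret_block`, `run_native`, `compiler_thread`, `execute_block` and `demo` are all hypothetical names, "compiling" is reduced to setting a flag, and the rare race mentioned above is not modeled. It only shows the idea that the emulation thread never waits for the compiler and interprets a block while native code for it is not ready yet.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_BLOCKS 4

/* Hypothetical block record, not Lightrec's real data structure. */
struct block {
    int guest_value;   /* stand-in for the guest code of the block */
    bool compiled;     /* set by the worker once "native code" exists */
};

static struct block blocks[NUM_BLOCKS];
static pthread_mutex_t blocks_lock = PTHREAD_MUTEX_INITIALIZER;

/* Both paths must give the same result; in a real recompiler only
 * the speed would differ. */
static int interpret_block(const struct block *b) { return b->guest_value * 2; }
static int run_native(const struct block *b) { return b->guest_value * 2; }

/* Worker thread: marks every block as compiled, in the background. */
static void *compiler_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < NUM_BLOCKS; i++) {
        pthread_mutex_lock(&blocks_lock);
        blocks[i].compiled = true;
        pthread_mutex_unlock(&blocks_lock);
    }
    return NULL;
}

/* Emulation thread: checks the flag and falls back to the interpreter
 * instead of blocking until the worker is done. */
static int execute_block(int i, bool *used_native)
{
    pthread_mutex_lock(&blocks_lock);
    bool ready = blocks[i].compiled;
    pthread_mutex_unlock(&blocks_lock);

    *used_native = ready;
    return ready ? run_native(&blocks[i]) : interpret_block(&blocks[i]);
}

/* Runs one block before and after the worker has finished.
 * Returns 0 on success, -1 if a thread call failed. */
int demo(void)
{
    pthread_t worker;
    bool native;

    for (int i = 0; i < NUM_BLOCKS; i++) {
        blocks[i].guest_value = i + 1;
        blocks[i].compiled = false;
    }

    /* Nothing is compiled yet: the interpreter path is taken. */
    assert(execute_block(2, &native) == 6 && !native);

    if (pthread_create(&worker, NULL, compiler_thread, NULL) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;

    /* After the worker finished: same result, native path. */
    assert(execute_block(2, &native) == 6 && native);
    return 0;
}
```

Calling `demo()` from a `main` and linking with the platform's pthread library is enough to try it. The single mutex around the `compiled` flag is the simplest possible choice for the sketch; a real implementation would need finer-grained synchronization.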