RPCS3 / rpcs3

PlayStation 3 emulator and debugger
https://rpcs3.net/
GNU General Public License v2.0

Cannot launch games on Linux build (F LDR: St11range_error thrown: Narrow error) #2516

Closed: thecityofguanyu closed this issue 7 years ago

thecityofguanyu commented 7 years ago

Have seen this with the two games that I successfully tested on Windows 10 -- Catherine and Hatsune Miku: Project DIVA F. On Linux, they both resulted in the following error in the log:

F LDR: St11range_error thrown: Narrow error (0x56295a14cf60)

I'm running the current latest RPCS3 build, b70a1edb. Linux OS info is 4.9.0-2-amd64 #1 SMP Debian 4.9.13-1 (2017-02-27) x86_64 GNU/Linux.

Dependency info:

/opt/rpcs3/bin/ sudo dpkg -l | egrep "(gcc-5)|(clang)" 
ii  clang                                          1:3.8-34+b1                          amd64        C, C++ and Objective-C compiler (LLVM based)
ii  clang-3.8                                      1:3.8.1-17                           amd64        C, C++ and Objective-C compiler (LLVM based)
ii  gcc-5                                          5.4.1-8                              amd64        GNU C compiler
ii  gcc-5-base:amd64                               5.4.1-8                              amd64        GCC, the GNU Compiler Collection (base package)
ii  libclang-common-3.8-dev                        1:3.8.1-17                           amd64        clang library - Common development package
ii  libclang1-3.8:amd64                            1:3.8.1-17                           amd64        C interface to the clang library
ii  libgcc-5-dev:amd64                             5.4.1-8                              amd64        GCC support library (development files)

John-Gee commented 7 years ago

radeonsi here

Enverex commented 7 years ago

Same here - OpenGL renderer string: Gallium 0.4 on AMD POLARIS10 (Mesa 17.0.3 / DRM 3.9.0 / 4.10.10-1-ARCH, LLVM 3.9.1)

kd-11 commented 7 years ago

So, just to be clear, everyone getting the llvm error is on radeonsi, right? I'm on radeonsi as well

deepbluev7 commented 7 years ago

I have no problem running rpcs3 on gentoo with radeonsi and compiled from source. Couldn't get the flatpak running, because I didn't want to recompile my kernel with the features needed for flatpak, however the llvm problem looks familiar (I've had a similar error with radv and mesa). I'm guessing you are bundling llvm with the flatpak and loading that shared lib. Then it's probably similar to this: https://lists.freedesktop.org/archives/mesa-dev/2016-October/130765.html This would happen if the radeonsi and rpcs3 both load llvm.
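To check whether more than one LLVM shared object ends up in the same process, the list of loaded objects can be inspected at runtime. The following is a minimal sketch for Linux/glibc using `dl_iterate_phdr`; the names `search_state`, `collect` and `loaded_objects_matching` are made up for illustration and are not part of RPCS3.

```cpp
#include <link.h>

#include <cstddef>
#include <string>
#include <vector>

namespace
{
    // Hypothetical helper state: the substring to look for and the hits so far.
    struct search_state
    {
        std::string needle;
        std::vector<std::string> matches;
    };

    // Callback invoked by dl_iterate_phdr once per loaded object.
    int collect(dl_phdr_info* info, std::size_t, void* data)
    {
        auto* state = static_cast<search_state*>(data);
        const std::string name = info->dlpi_name ? info->dlpi_name : "";
        if (name.find(state->needle) != std::string::npos)
        {
            state->matches.push_back(name);
        }
        return 0; // a non-zero return value would stop the iteration
    }
}

// Returns the paths of all currently loaded objects whose name contains needle.
std::vector<std::string> loaded_objects_matching(const std::string& needle)
{
    search_state state{needle, {}};
    dl_iterate_phdr(&collect, &state);
    return state.matches;
}
```

Calling `loaded_objects_matching("LLVM")` after the graphics driver has been initialized and seeing more than one distinct path would point in the direction described above. The sketch only sees separate shared objects; an LLVM that is statically linked into the executable does not show up in this list.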

TingPing commented 7 years ago

> I'm guessing you are bundling llvm with the flatpak and loading that shared lib. Then it's probably similar to this: https://lists.freedesktop.org/archives/mesa-dev/2016-October/130765.html This would happen if the radeonsi and rpcs3 both load llvm.

Yes that was my conclusion too.

TingPing commented 7 years ago

So for now I've disabled llvm on the flatpak until we figure it out.

Using a shared llvm I get this error:

[100%] Linking CXX executable ../bin/rpcs3
/usr/lib/gcc/x86_64-unknown-linux/6.2.0/../../../../x86_64-unknown-linux/bin/ld: CMakeFiles/rpcs3.dir/Emu/Cell/PPUTranslator.cpp.o: undefined reference to symbol '_ZN4llvm12ConstantExpr9getSelectEPNS_8ConstantES2_S2_PNS_4TypeE'
//lib/libLLVMCore.so.4: error adding symbols: DSO missing from command line

EDIT: Seems I have figured it out....

hcorion commented 7 years ago

@TingPing check out this article: http://stackoverflow.com/questions/19901934/strange-linking-error-dso-missing-from-command-line

deepbluev7 commented 7 years ago

@TingPing I have a similar error on gentoo, because the llvm libs are built as separate shared objects. Changing the used llvm components in rpcs3/CMakeLists.txt to 'all' fixes the linker error. I couldn't figure out a sane way to fix that, so that's a workaround.

TingPing commented 7 years ago

Yup got the linking fixed (this should be fixed upstream) but at runtime I now get this crash:

#0  0x00000032efe3304f in raise () from /lib/libc.so.6
#1  0x00000032efe3447a in abort () from /lib/libc.so.6
#2  0x00007fd264e17e9d in __gnu_cxx::__verbose_terminate_handler() () from /lib/libLLVMDemangle.so.4
#3  0x00007fd264dad226 in __cxxabiv1::__terminate(void (*)()) () from /lib/libLLVMDemangle.so.4
#4  0x00007fd264dad271 in std::terminate() () from /lib/libLLVMDemangle.so.4
#5  0x00007fd264dac2c8 in __cxa_throw () from /lib/libLLVMDemangle.so.4
#6  0x00007fd264da88f2 in std::__throw_bad_cast() () from /lib/libLLVMDemangle.so.4
#7  0x0000000000ab3929 in std::__detail::_Scanner<char>::_Scanner(char const*, char const*, std::regex_constants::syntax_option_type, std::locale) ()
#8  0x0000000000ac1984 in std::__detail::_Compiler<std::__cxx11::regex_traits<char> >::_Compiler(char const*, char const*, std::locale const&, std::regex_constants::syntax_option_type) ()
#9  0x0000000000ac1f75 in std::enable_if<std::__detail::__is_contiguous_normal_iter<char const*>::value, std::shared_ptr<std::__detail::_NFA<std::__cxx11::regex_traits<char> > const> >::type std::__detail::__compile_nfa<char const*, std::__cxx11::regex_traits<char> >(char const*, char const*, std::__cxx11::regex_traits<char>::locale_type const&, std::regex_constants::syntax_option_type) ()
#10 0x0000000000ac20df in std::__cxx11::basic_regex<char, std::__cxx11::regex_traits<char> >::basic_regex(char const*, std::regex_constants::syntax_option_type) ()
#11 0x00000000006eae38 in _GLOBAL__sub_I__ZN3vfs5mountERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES7_ ()
#12 0x0000000000ea084d in __libc_csu_init ()
#13 0x00000032efe20220 in __libc_start_main () from /lib/libc.so.6
#14 0x00000000006f38ca in _start () at ../sysdeps/x86_64/start.S:120

deepbluev7 commented 7 years ago

Looks like there is still some mismatch between the linked libs. __libc_csu_init runs the initialization routines for global variables, and a bad_cast suggests that it does a dynamic_cast where the type doesn't match, e.g. if the type in the lib loaded at runtime differs from the one in the lib it was compiled against, or something like that. How did you solve your linker errors?
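In the backtrace, the `std::basic_regex` constructor runs from a `_GLOBAL__sub_I_...` static initializer, that is, before `main`. An exception thrown at that point cannot be caught by the program and ends in `std::terminate`. One common way to make such a failure catchable is to construct the object on first use instead. This is a sketch only: the pattern and the names `hdd_path_regex` and `is_hdd_path` are hypothetical, since the real expressions in rpcs3/Emu/VFS.cpp are not shown in this thread, and it would not cure a library mismatch.

```cpp
#include <regex>
#include <string>

// Hypothetical pattern and names, for illustration only.
const std::regex& hdd_path_regex()
{
    // Constructed on the first call rather than during static initialization,
    // so an exception from the constructor propagates to the caller.
    static const std::regex re("^/dev_hdd0/.*", std::regex::optimize);
    return re;
}

// Returns true if the whole path matches the hypothetical pattern above.
bool is_hdd_path(const std::string& path)
{
    return std::regex_match(path, hdd_path_regex());
}
```

With this shape, a caller can wrap the first use in a `try` block and log the exception instead of aborting during startup.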

TingPing commented 7 years ago

@MonokelPinguin Like you mentioned listing more required components or just using all.

deepbluev7 commented 7 years ago

Do you bundle llvm with the flatpak?

TingPing commented 7 years ago

LLVM is part of the base runtime that already exists, here are its build flags for the curious: https://github.com/flatpak/freedesktop-sdk-images/blob/1.6/org.freedesktop.Sdk.json.in#L1052-L1112

So both the driver and the application are built against the same shared library.

deepbluev7 commented 7 years ago

Did you compile with gcc or with clang? Also, does removing the second argument of s_regex_ps3 and s_regex_psv in rpcs3/Emu/VFS.cpp change anything? (remove the std::regex::optimize)

TingPing commented 7 years ago

GCC, as the build system forced it. As a test, bypassing that and building with clang gives the same result.

> Also, does removing the second argument of s_regex_ps3 and s_regex_psv in rpcs3/Emu/VFS.cpp change anything?

No change.

deepbluev7 commented 7 years ago

I'm running out of ideas. Kdevelop seems to manage to bundle a newer llvm. Maybe you could try to base your flatpak on that?

TingPing commented 7 years ago

> I'm running out of ideas. Kdevelop seems to manage to bundle a newer llvm. Maybe you could try to base your flatpak on that?

We already tried static builds of llvm and saw how that went.

deepbluev7 commented 7 years ago

I don't think kdevelop links statically, but it uses its own bundled llvm that's built as a single lib (that's how it is recommended to build llvm). Also, kdevelop needs graphics acceleration and loads llvm, so they seem to have solved the problem.

hcorion commented 7 years ago

This has gone on for quite a bit and no longer relates to the original issue. Perhaps we could move to a new, separate issue about AppImages and discuss it there?

probonopd commented 7 years ago

@hcorion opened #2744

John-Gee commented 7 years ago

Well, I just rebuilt RPCS3 and I've been able to play Hatsune Miku Project Diva F. I think it had been a year or so since I had been able to start it, so good job!

My spec: Arch Linux x64 with radeonsi (git) and llvm (svn, 5.0 trunk), both interpreter fast. asmjit works too but seems slower at a quick glance.

So in my case this bug is solved...

Enverex commented 7 years ago

So what are the actual requirements for this to work right now? I built from GIT last night and still encountered the same error (Arch, LLVM4, Nvidia binary drivers). Is LLVM5 the key?

John-Gee commented 7 years ago

Well I cannot use the llvm backend, so I doubt it but I don't know. I'm on the testing branch by the way.

refi64 commented 7 years ago

@Enverex If you're using GCC, try switching to Clang.

Enverex commented 7 years ago

No change. Same error (clang version 4.0.0).

hcorion commented 7 years ago

Yes, use LLVM 4 not LLVM 5.

al0xf commented 7 years ago

Can someone confirm that the error goes away compiling with LLVM5 and changing nothing else?

hcorion commented 7 years ago

@al0xf Perhaps @Enverex could try installing llvm-svn and clang-svn from the Arch Linux AUR, you seem to be the only one still affected by this issue in the thread.

Enverex commented 7 years ago

I have the SVN version of LLVM/Clang installed now but I assume I need to modify the makefile to like it? As currently it complains...

CMake Warning at rpcs3/CMakeLists.txt:95 (find_package):
  Could not find a configuration file for package "LLVM" that is compatible
  with requested version "4.0".

  The following configuration files were considered but not accepted:

    /usr/lib64/cmake/llvm/LLVMConfig.cmake, version: 5.0.0svn-r302997
    /usr/lib/cmake/llvm/LLVMConfig.cmake, version: 5.0.0svn-r302997
    /lib64/cmake/llvm/LLVMConfig.cmake, version: 5.0.0svn-r302997
    /lib/cmake/llvm/LLVMConfig.cmake, version: 5.0.0svn-r302997

Enverex commented 7 years ago

Ok, forced it through by telling it just to look for LLVM rather than LLVM 4.0. Wouldn't build though...

[ 38%] Building CXX object rpcs3/CMakeFiles/rpcs3.dir/Emu/Cell/SPURecompiler.cpp.o
/mnt/store/Build/rpcs3-git/src/rpcs3/rpcs3/Emu/Cell/PPUTranslator.cpp:71:59: error: no member named 'FunctionIndex' in 'llvm::AttributeSet'
        , m_pure_attr(AttributeSet::get(m_context, AttributeSet::FunctionIndex, {Attribute::NoUnwind, Attribute::ReadNone}))
                                                   ~~~~~~~~~~~~~~^
/mnt/store/Build/rpcs3-git/src/rpcs3/rpcs3/Emu/Cell/PPUTranslator.cpp:135:27: error: no member named 'getArgumentList' in 'llvm::Function'
        m_thread = &*m_function->getArgumentList().begin();
                     ~~~~~~~~~~  ^
/mnt/store/Build/rpcs3-git/src/rpcs3/rpcs3/Emu/Cell/PPUTranslator.cpp:154:96: error: no matching constructor for initialization of 'llvm::AllocaInst'
        for (u32 i = 0; i < 32; i++) if (!m_vr[i]) m_vr[i] = m_g_vr[i] ? m_g_vr[i] : m_ir->Insert(new AllocaInst(GetType<u32[4]>(), nullptr, 16, fmt::format(".v%d", i)));
                                                                                                      ^          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

hcorion commented 7 years ago

@Enverex Yes, compile RPCS3 without LLVM and make sure it's looking for 4.0, because 5.0 has changes that RPCS3 hasn't adapted to yet. You just need to make sure to compile rpcs3 with clang rather than gcc.

Enverex commented 7 years ago

Ah, I was compiling with Clang but hadn't disabled LLVM. Ok, done that. Same error. Are you sure there's nothing else that can cause this error message? That said, I just noticed the error is the same but the number has changed, it's now - F LDR: St11range_error thrown: Narrow error (0x5586ef454f40)

hcorion commented 7 years ago

@Enverex Nobody really knows what's causing the error message, but for some reason some people get it and some don't, and then all of a sudden it starts working for them, after a restart, or just randomly after a re-build.

@al0xf We have confirmed that switching to LLVM 4 or 5 doesn't fix the issue.

mirh commented 7 years ago

So... Just for [some?] records.. I tried to check where the heck this St11range_error is placed, and I found out the only place on earth it seems to be is Red Hat's devtoolset-gcc (and compat-sap) libstdc++-compat patch of file stdexcept.cc (which makes sense given we are thrown an exception).

Now, for as much as normally that would be meaningless (I mean, it's simply the function in charge of returning errors + this) my question is: wtf would you even find such a package under arch?
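For reference, `St11range_error` is not specific to any distribution package: under the Itanium C++ ABI used by GCC and Clang on Linux it is the mangled type name of `std::range_error`, which is what `typeid(e).name()` returns for such an exception. A minimal sketch (the helper names `raw_type_name` and `demangle` are made up for illustration):

```cpp
#include <cstdlib>
#include <cxxabi.h>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Returns the implementation-defined (mangled) name of the dynamic type of e.
std::string raw_type_name(const std::exception& e)
{
    return typeid(e).name();
}

// Demangles a name with the Itanium C++ ABI helper.
// Returns the input unchanged if demangling fails.
std::string demangle(const std::string& mangled)
{
    int status = 0;
    char* text = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? std::string(text) : mangled;
    std::free(text); // free(nullptr) is a no-op
    return result;
}
```

Here `raw_type_name(std::range_error("x"))` yields `St11range_error` with GCC or Clang on Linux, and `demangle` turns that into `std::range_error`. The log line therefore only says that a `std::range_error` carrying the message "Narrow error" was thrown somewhere in the loader.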

ghost commented 7 years ago

I would like to get an idea of how people are debugging this. Is debugging occurring through GDB or Strace or something else? I would like to take a crack at figuring this out since I use Linux, but wanted to get some leads for where others have looked or started to look.

hcorion commented 7 years ago

@maximstewart Are you having this issue as well? I've used GDB in the past; I'm not familiar with Strace though. Because of the way the emulator is built, you may receive false positives which GDB will catch and try to inform you about, but which are actually normal operation.

For this issue you could probably use GDB: just compile rpcs3 in debug mode, and then see if GDB stops and reports anything when you run a game.

kd-11 commented 7 years ago

LLVM issues should be resolved with https://github.com/RPCS3/rpcs3/pull/2811

Enverex commented 7 years ago

Has anyone actually confirmed that this fixes the "St11range_error thrown: Narrow error" issue? I can test it shortly if not.

Enverex commented 7 years ago

Confirming that this issue is not fixed.

kd-11 commented 7 years ago

We need to gather a list of conditions needed to recreate this bug, so I can recreate it in a VM.

PopusBenedictus commented 7 years ago

I can confirm this to be an issue for me using Arch and just using the rpcs3-git package from AUR. Out of curiosity, I attempted to run the emulator with valgrind. Enabling the llvm recompiler and running Gran Turismo 6 did not yield the error in this case. Of course, the recompiler failed on an unrelated error due to an unsupported syscall.

I then tried with the interpreter* (fast) and that got the game window open, albeit black (mostly because valgrind was slowing everything way down).

I exited out of the program, and then attempted to run the emulator outside of it and launch the same game with both the llvm recompiler selected and the fast interpreter. In both cases I got the narrow error prompt again.

I should note that Arch ships llvm 4.0.0 from its own repositories, and I do not have older versions of llvm or clang installed. I'm not too adept with valgrind but it's doing something to rpcs3 that avoids the precondition for the bug.

Enverex commented 7 years ago

I built LLVM/Clang 5 from GIT and it still happened there, so it isn't 4.x specific.

PopusBenedictus commented 7 years ago

I inserted a bunch of log calls and chased the problem down to this: https://github.com/RPCS3/rpcs3/blob/f010b5b235a9cda9ad1bcf84ae972c7e6c3de76b/rpcs3/Emu/Cell/PPUThread.cpp#L179

I'm not a C++ expert so I might be making some naive/incorrect assumptions, but I wonder if we are seeing different behavior on one of the casts between platforms and it's causing a range error to be thrown when Emulator::Load() is called.

What I see: Within Load(), ppu_load_exec is called. ppu_load_exec calls ppu_register_range(), and the exception is thrown at the point linked above. The exception is caught in Load(), a fatal error is logged, and the game never loads.
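The exception type and message suggest a checked narrowing conversion, i.e. a cast that throws when the value does not fit the target type. A minimal sketch of that kind of check follows. `checked_narrow` and `fits_in_u32` are hypothetical names and this is not RPCS3's code; only the exception type and the "Narrow error" text come from the thread. Whether the logged value 0x56295a14cf60 is a host address being stored in 32 bits is an assumption, not something the thread confirms.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not RPCS3's implementation: converts value to To and
// throws std::range_error if converting back does not reproduce the input.
// Intended for unsigned integer types in this sketch.
template <typename To, typename From>
To checked_narrow(From value)
{
    const To result = static_cast<To>(value);
    if (static_cast<From>(result) != value)
    {
        throw std::range_error("Narrow error");
    }
    return result;
}

// Returns true if the value survives conversion to a 32-bit unsigned integer.
bool fits_in_u32(std::uint64_t value)
{
    try
    {
        checked_narrow<std::uint32_t>(value);
        return true;
    }
    catch (const std::range_error&)
    {
        return false;
    }
}
```

Under that assumption, `fits_in_u32(0x56295a14cf60)` is false and the conversion throws, while a small value such as `0x10000` passes. That would also fit the observation that the number in the message changes between runs and builds.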

kd-11 commented 7 years ago

It seems to be an arch issue only as far as I can tell. That helps since I should be able to recreate it. It was already known that it happens in emu::load but thanks for your assistance tracking the exception down. I'll try to recreate the conditions in an arch vm and fix this once and for all.

John-Gee commented 7 years ago

I'm on arch, I had the issue before, but not anymore (well at least when I tried a week or 2 ago) so I'm not convinced that it really is related to Arch. That could be because I'm using mesa and llvm from the mesa-git repository and maybe some recent changes fixed the issue there..

XenonPK commented 7 years ago

I can reproduce this on OpenSUSE Tumbleweed, with this package.

EDIT: Qt 5.9 & gcc 7

hcorion commented 7 years ago

@XenonPK Ok, here is what I want you to do. Compile it with clang, but add this to rpcs3/CMakeLists.txt at around line 105:

set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -image-base=0x10000")

gourdcaptain commented 7 years ago

Tried the CMake modification in hcorion's post above and it worked on Arch Linux 64-bit with Clang 4.0 and an AMD RX 460 running the open-source drivers and I got a game running in RPCS3 for the first time. (Version v.0.0.2-5291-9cc52c75 is what it calls itself.)
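Assuming the linker flag works by mapping the executable at a low address (an interpretation the thread does not confirm), one way to compare a working build with a failing one is to print where a function of the executable ends up and whether that address fits in 32 bits. The names `address_fits_u32`, `probe` and `probe_address` are hypothetical and only serve this sketch.

```cpp
#include <cstdint>
#include <limits>

// Returns true if the address can be stored in 32 bits without loss.
bool address_fits_u32(std::uintptr_t address)
{
    return address <= std::numeric_limits<std::uint32_t>::max();
}

// Any function defined in the executable itself serves as a probe.
void probe() {}

// Address at which the probe function was mapped for this run.
std::uintptr_t probe_address()
{
    return reinterpret_cast<std::uintptr_t>(&probe);
}
```

Printing `probe_address()` and `address_fits_u32(probe_address())` from both builds shows whether the flag changed the load address. The addresses quoted in the error messages earlier in the thread, such as 0x5586ef454f40, would not pass this check.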

Enverex commented 7 years ago

I tried adding that to line 105 and although it compiles and launches fine, it immediately exits (without error or output on the terminal) when I try and launch a game.

XenonPK commented 7 years ago

Got it to open the game's window with those linker changes (shovel knight OGL)