Closed orbea closed 7 years ago
GLideN64-2017.01.31_bf1e748_master-x86_64-1_git
With GLideN64 it will segfault when starting instead of closing.
Thread 1 "mupen64plus" received signal SIGABRT, Aborted.
0x00007ffff731a5cf in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff731a5cf in raise () from /lib64/libc.so.6
#1 0x00007ffff731c1ba in abort () from /lib64/libc.so.6
#2 0x00007ffff7312b67 in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007ffff7312c12 in __assert_fail () from /lib64/libc.so.6
#4 0x00007fffeffd3b40 in _createShader (_type=35633,
_strShader=0x7ffff0089100 "#version 330 core \nin highp vec4 aPosition;\t\t\t\t\t\t\t\nin lowp vec4 aColor;\t\t\t\t\t\t\t\t\nin highp vec2 aTexCoord0;\t\t\t\t\t\t\t\nin highp vec2 aTexCoord1;\t\t\t\t\t\t\t\nin lowp float aNumLights;\t\t\t\t\t\t\t\nin highp vec4 aModify"...)
at /tmp/SBo/GLideN64/src/OGL3X/GLSLCombiner_ogl3x.cpp:206
#5 0x00007fffeffd3b61 in InitShaderCombiner ()
at /tmp/SBo/GLideN64/src/OGL3X/GLSLCombiner_ogl3x.cpp:212
#6 0x00007fffeff82b68 in Combiner_Init ()
at /tmp/SBo/GLideN64/src/Combiner.cpp:77
#7 0x00007fffeffafc7b in OGLRender::_initData (
this=0x7ffff06dc5d0 <OGLVideo::get()::video+1104>)
at /tmp/SBo/GLideN64/src/OpenGL.cpp:2186
#8 0x00007fffeffa7792 in OGLVideo::start (
this=0x7ffff06dc180 <OGLVideo::get()::video>)
at /tmp/SBo/GLideN64/src/OpenGL.cpp:160
#9 0x00007fffeffcd3ea in PluginAPI::RomOpen (
this=0x7ffff06dbb10 <PluginAPI::get()::api>)
at /tmp/SBo/GLideN64/src/common/CommonAPIImpl_common.cpp:209
#10 0x00007fffeffcd760 in RomOpen ()
at /tmp/SBo/GLideN64/src/MupenPlusPluginAPI.cpp:8
#11 0x00007ffff56562d3 in main_run () at ../../src/main/main.c:981
#12 0x0000000000402050 in main (argc=6, argv=0x7fffffffe1c8)
at ../../src/main.c:782
GDB log - http://pastebin.com/3VyQWiTC
The second crash seems to be unrelated... Older commits of GLideN64 will segfault when closed too.
I bisected it and found the first bad commit.
commit ed1c410f17e78709dfc6050fe4ccafb6d7f3c3a9
Author: Bobby Smiles <bobby.smiles32@gmail.com>
Date: Thu Dec 1 17:11:46 2016 +0100
Move rjump variables inside r4300_core struct.
:040000 040000 e8c1d77d6d9334f528891b39862715b4b63d892f 5ea9932890561262d73f7c34c1b77d16fb9d5326 M src
ed1c410f17e78709dfc6050fe4ccafb6d7f3c3a9 @bsmiles32 Can you take a look at this?
Thanks for bissecting. I don't really see what could cause the segfault here. And I didn't experienced it. I'll take a look, but don't have much time atm.
@bsmiles32 Reverting the changes in src/device/r4300/x86_64/rjump.c
works around the segfault. The issue has to be in there.
@orbea could you try replacing:
" sub $0x10, %%rsp \n"
by
" sub $0x20, %%rsp \n"
in rjump.c
I remember experiencing a similar issue in windows (see dyna_start.asm)
I tried that and it still segfaults unfortunately.
Try increasing the value.
I did, to 50, 100 and 1000, no changes.
Thanks for testing I just wanted to be sure this was not a stack issue. This might also be a compiler issue when trying to build the intrinsic asm code in rjump.c We could try to build dyna_start.asm instead of rjump.c just to be sure, but I'll need to make some changes to the Makefile for this. In the meantime could you do a objdump -d on mupen64plus-core.so so I can check the disassembly. If you know how to generate the map file this would be helpfull also.
The map file should be generated when executing the following command before doing a make all: export LDFLAGS="-Wl,-Map=output.map"
I attempted to build it with those LDFLAGS and then got the resulting output from objdump -d, does this help?
objdump -d: http://ks392457.kimsufi.com/orbea/stuff/logs/mupen64plus_objdump.log
output.map: http://pastebin.com/T71qb5NG
Thanks, I checked the disassembly and everything looks good. The last issue I can think of is memory corruption, you should check that the value of g_dev.r4300.save_rsp and g_dev.r4300.save_rip are not overwritten during execution. Put a breakpoint in dyna start with gdb and step through instructions to get the value of rsp when reaching mov %%rsp, %[save_rsp] and the value of rax when reaching mov %%rax, %[save_rip]. Use the next command when reaching call *%%rbx. gdb should break in mov %[save_rsp], %%rsp when closing emulation, check the value of rsp after the instruction.
I'm not all that experienced at gdb, but i gave it a try. I could of misunderstood something... http://pastebin.com/H4ZkQwUT
I forgot to say you need to run the following command to show disassembly: set disassemble-next-line
Alright, this is hopefully more useful? http://pastebin.com/NFhpsGHK
Comment the debug message line to jump directly to the usefull code and use stepi to step to next instruction and nexti to step over a call instruction, you're almost a gdb expert ;-)
I tried that, is this better? http://pastebin.com/4exzJEF4
I tried using nexti
on any call instructions and the game started after the last one where I then exited normally and continued with stepi
until it segfaulted...
Oh I see what's happening here, the compiler allocated an additional register "rdx" to store the base address of the g_dev structure or rdx obviously doesn't contain that address anymore when you stop emulation and jump back to the mov 0xa00(%rdx),%rsp
instruction. Using dyna_start.asm instead of the intrinsic asm code will fix the issue.
Nice catch @Gillou68310 ! So maybe putting rdx (and any registers modified inside the call rbx) in the clobber list would fix this ?
@bsmiles32 I guess it should work too, but honestly I prefer using asm files. I guess it's just a matter of taste but at least I'm sure gcc won't use registers that I didn't explicitly allocated ;-)
@orbea could you test this branch https://github.com/Gillou68310/mupen64plus-core/tree/dyna_start_x86_64 Updated
@Gillou68310 Inlined asm has the "benefit" of letting the optimizer do his job, especially register allocation, but as we saw this is a double edged sword. Also inlined asm is GCC only.
Should we also update the x86 version too to use an asm file instead of the inlined asm ?
Should we also update the x86 version too to use an asm file instead of the inlined asm ?
Yes I think we should, I can take care of it if you want.
@Gillou68310 I tried your branch and there is no segfault, thanks!
@orbea you're welcome ;-) Thanks for helping me tracking down this issue.
@bsmiles32 I created the asm file for x86 https://github.com/Gillou68310/mupen64plus-core/commits/dyna_start could you check if it builds fine on linux?
I tested building your new branch on a Slackware-14.2 chroot (X86) and it built fine, my chroot is not set up to try running it though.
Thanks @orbea, I tested on windows and it was running just fine, hopefully it should run fine on linux too ;-)
Since the fix has been merged into the master and I have tested that it is still fixed I am going to close this, thanks!
For reference. https://github.com/mupen64plus/mupen64plus-core/commit/dd05baf40c0043d2b631c2ed2a43e1d62ff2f51e
OS:
Slackware64-current
mupen64plus-2017.02.03_083f825-x86_64-1_git
glide64-2016.09.01_56be25d-x86_64-1_git
When closing games mupen64plus will segfault and it does not seem to matter what plugin.
Here is a backtrace with glide64 and hle.
GDB log - http://pastebin.com/jzbjVWMp