Closed vnen closed 7 years ago
To be clear, you are seeing these issues with the editor or the template? i.e. tools=yes target=release
or tools=no target=release
?
It's not possible to tools=yes target=release
(the editor needs some debug symbols). So I'm using tools=no target=release
.
Testing on Mageia 6, Linux 64-bit using the Godot 2.0.1 templates for:
And I got no segfault in the _drag_anddrop and _inputmapping demos. So I suspect a compiler issue; what platform are you testing on and what compiler did you use? Do you also get a sigserv using the official Godot 2.0.x binaries?
You might be right about the compiler, I have no problem with Godot 2.0.2 release template. I'm on Arch Linux, so always updated GCC (currently version 6.1.1). Will try with clang.
Can this be a compiler bug?
It could be a compiler bug, or a compiler change that renders our code buggy :-P
GCC 6 builds with -std=gnu++14
by default no?
I'll try with GCC 5.3.1 locally.
The release compiled with Clang does not segfault. It is a compiler thing indeed. Need to investigate recent changes in GCC.
GCC 6 builds with -std=gnu++14 by default no?
Yes, and this causes compile errors in old Godot code, so I had to add -std=c++98
in my builds.
@vnen ~ considering that I can/could reproduce this, and that I'm on Arch too, it might be GCC-related indeed
I confirm that it works fine with a Linux X11 64-bit release template built with GCC 5.3.1 (20160503), so it's likely GCC 6-specific yeah.
Does scons have a way to specify the C++ standard being used?
@her001 SCons can add flags to the compiler command line, which in turn can change the C++ standard.
@vnen Is that cross platform? (sounds like no)
Is that cross platform?
It's possible to detect the compiler/platform. But I don't think the C++ standard is causing this specific issue.
Tried a optimized build with debug symbols and got this
Thread 1 "godot.x11.opt.6" received signal SIGSEGV, Segmentation fault.
DVector<String>::resize (this=<optimized out>, p_size=<optimized out>)
at ./core/dvector.h:396
396 memnew_placement(&t[i], T );
(gdb) where
#0 DVector<String>::resize (this=<optimized out>, p_size=<optimized out>)
at ./core/dvector.h:396
#1 0x00000000011fff8c in ResourceInteractiveLoaderBinary::parse_variant (
this=this@entry=0x244e440, r_v=...) at core/io/resource_format_binary.cpp:542
#2 0x00000000011fe869 in ResourceInteractiveLoaderBinary::parse_variant (
this=this@entry=0x244e440, r_v=...) at core/io/resource_format_binary.cpp:461
#3 0x0000000001201b4b in ResourceInteractiveLoaderBinary::poll (this=0x244e440)
at core/io/resource_format_binary.cpp:772
#4 0x0000000001228b2a in ResourceFormatLoader::load (this=<optimized out>, p_path=...,
p_original_path=..., r_error=0x0) at core/io/resource_loader.cpp:131
#5 0x00000000012282b9 in ResourceLoader::load (p_path=..., p_type_hint=...,
p_no_cache=p_no_cache@entry=false, r_error=r_error@entry=0x0)
at core/io/resource_loader.cpp:195
#6 0x0000000000429ff3 in Main::start () at main/main.cpp:1430
#7 0x000000000040d0ed in main (argc=<optimized out>, argv=0x7fffffffe3b8)
at platform/x11/godot_x11.cpp:40
Might be an optimization bug in GCC. I'm not verse in C++ memory management well enough to tell what's wrong with it.
So #4636 is caused by the compiler too, even with a lesser optimization. It might be good to announce that GCC 6 is not really working and maybe report a bug to them (or find what's wrong with Godot code if there's anything).
slapin@slapin-pc:~/godot/demos/2d/platformer$ gcc-6 -v Using built-in specs. COLLECT_GCC=gcc-6 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 6.1.1-1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.1.1 20160430 (Debian 6.1.1-1) slapin@slapin-pc:~/godot/demos/2d/platformer$ g++-6 -v Using built-in specs. COLLECT_GCC=g++-6 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 6.1.1-1' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix gcc version 6.1.1 20160430 (Debian 6.1.1-1)
Debian, can't reproduce this issue. tried release_debug and release targets and platformer demo. Any suggestions for test?
Also, I did not need to setup any -std options. Build log: https://gist.github.com/4783855450fe9e82ad601c6cbd80bec7
I'd suggest add -g option to build and try valgrind
I see the following difference in -O3 optimizations for gcc5 vs gcc6:
--- opt5.txt 2016-05-13 19:10:23.556164665 +0300
+++ opt6.txt 2016-05-13 19:10:12.160114305 +0300
@@ -80,14 +80,12 @@
-fisolate-erroneous-paths-dereference [enabled]
-fivopts [enabled]
-fjump-tables [enabled]
+ -fkeep-gc-roots-live [disabled]
-flifetime-dse [enabled]
+ -flifetime-dse= 0x2
-flive-range-shrinkage [disabled]
- -floop-block [disabled]
- -floop-interchange [disabled]
-floop-nest-optimize [disabled]
-floop-parallelize-all [disabled]
- -floop-strip-mine [disabled]
- -floop-unroll-and-jam [disabled]
-flra-remat [enabled]
-fmath-errno [enabled]
-fmodulo-sched [disabled]
@@ -105,12 +103,14 @@
-fpeel-loops [disabled]
-fpeephole [enabled]
-fpeephole2 [enabled]
+ -fplt [enabled]
-fpredictive-commoning [enabled]
-fprefetch-loop-arrays [enabled]
-freciprocal-math [disabled]
-freg-struct-return [disabled]
-frename-registers [enabled]
-freorder-blocks [enabled]
+ -freorder-blocks-algorithm= stc
-freorder-blocks-and-partition [enabled]
-freorder-functions [enabled]
-frerun-cse-after-loop [enabled]
@@ -142,7 +142,6 @@
-fsel-sched-reschedule-pipelined [disabled]
-fselective-scheduling [disabled]
-fselective-scheduling2 [disabled]
- -fshort-double [disabled]
-fshort-enums [enabled]
-fshort-wchar [disabled]
-fshrink-wrap [enabled]
@@ -151,7 +150,9 @@
-fsimd-cost-model= unlimited
-fsingle-precision-constant [disabled]
-fsplit-ivs-in-unroller [enabled]
+ -fsplit-paths [enabled]
-fsplit-wide-types [enabled]
+ -fssa-backprop [enabled]
-fssa-phiopt [enabled]
-fstack-reuse= all
-fstdarg-opt [enabled]
@@ -168,10 +169,8 @@
-ftree-builtin-call-dce [enabled]
-ftree-ccp [enabled]
-ftree-ch [enabled]
- -ftree-coalesce-inlined-vars [disabled]
-ftree-coalesce-vars [enabled]
-ftree-copy-prop [enabled]
- -ftree-copyrename [enabled]
-ftree-cselim [enabled]
-ftree-dce [enabled]
-ftree-dominator-opts [enabled]
@@ -203,6 +202,7 @@
-ftree-ter [enabled]
-ftree-vectorize [disabled]
-ftree-vrp [enabled]
+ -funconstrained-commons [disabled]
-funroll-all-loops [disabled]
-funroll-loops [disabled]
-funsafe-loop-optimizations [disabled]
You can find this list out by runing:
g++-5 -O3 -Q --help=optimizers >opt5.txt
g++-6 -O3 -Q --help=optimizers >opt6.txt
diff -u opt5.txt opt6.txt
So trying -fno- of added enabled optimizations should be tried first.
I tested with commit 98bff2f595beda46a35ce66ac4d714fbc41ba2ba you will probably want to try this commit and see if it works for you.
@slapin:
tried release_debug and release targets and platformer demo. Any suggestions for test?
In the OP I stated the platformer demo does not give this error, likely because it's not using binary scenes, only .tscn
scenes. The gui/drag_and_drop demo was the first that gave this issue, but any GUI demo seems to cause it.
Also, in the editor with release_debug gives a crash if you add a control, add a new theme to it and try to edit the theme (as seen in #4636). This crash does not happens if built with Clang, nor if built with target=debug.
Yeah, confirm this. checking.
On Fri, May 13, 2016 at 8:25 PM, George Marques notifications@github.com wrote:
@slapin https://github.com/slapin:
tried release_debug and release targets and platformer demo. Any suggestions for test?
In the OP I stated the platformer demo does not give this error, likely because it's not using binary scenes, only .tscn scenes. The gui/drag_and_drop demo was the first that gave this issue, but any GUI demo seems to cause it.
Also, in the editor with release_debug gives a crash if you add a control, add a new theme to it and try to edit the theme (as seen in #4636 https://github.com/godotengine/godot/issues/4636). This crash does not happens if built with Clang, nor if built with target=debug.
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/godotengine/godot/issues/4623#issuecomment-219106777
Well, I did quick analysis.
replace env.Append(CCFLAGS=['-O3','-ffast-math']) with env.Append(CCFLAGS=['-O2','-ffast-math']) env.Append(CCFLAGS=['-fgcse-after-reload', '-finline-functions', '-fipa-cp-clone', '-fpredictive-commoning', '-fsplit-paths', '-ftree-loop-distribute-patterns', '-ftree-loop-vectorize'])
env.Append(CCFLAGS=['-ftree-partial-pre','-ftree-slp-vectorize'])
env.Append(CCFLAGS=['-funswitch-loops'])
Or use env.Append(CCFLAGS=['-O3','-ffast-math', '-fno-vect-cost-model']) And no problem occurs. The above is safer and gives more control the culprit is -fvect-cost-model=dynamic gcc option which is added by gcc -O3.
The reason is probably alignment issues which are triggered (trapped) by vectorized code. Somebody should try building Godot with -fsanitize=undefined and see the traps.
The removal of -fvect-cost-model=dynamic just hides an issue, not fixing it, so consider it a workaround.
On Fri, May 13, 2016 at 8:32 PM, Sergey Lapin slapinid@gmail.com wrote: @
Yeah, confirm this. checking.
On Fri, May 13, 2016 at 8:25 PM, George Marques notifications@github.com wrote:
@slapin https://github.com/slapin:
tried release_debug and release targets and platformer demo. Any suggestions for test?
In the OP I stated the platformer demo does not give this error, likely because it's not using binary scenes, only .tscn scenes. The gui/drag_and_drop demo was the first that gave this issue, but any GUI demo seems to cause it.
Also, in the editor with release_debug gives a crash if you add a control, add a new theme to it and try to edit the theme (as seen in
4636 https://github.com/godotengine/godot/issues/4636). This crash
does not happens if built with Clang, nor if built with target=debug.
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/godotengine/godot/issues/4623#issuecomment-219106777
Also I need to mention that duse to alignment issues the code is executed less effectively, which means finding issues and fixing them will speed-up execution.
On my PC Godot crashes on:
0x000000000059ee06 <+1382>: pxor xmm0,xmm0
0x000000000059ee0a <+1386>: add rsi,r10
0x000000000059ee0d <+1389>: lea rdi,[rax+rsi*8+0x4]
0x000000000059ee12 <+1394>: xor esi,esi
0x000000000059ee14 <+1396>: add esi,0x1
0x000000000059ee17 <+1399>: add rdi,0x10
=> 0x000000000059ee1b <+1403>: movaps XMMWORD PTR [rdi-0x10],xmm0
0x000000000059ee1f <+1407>: cmp r8d,esi
movaps
instruction needs the address to be aligned (SSE2) to 16-bytes. The address is currently stored in the rdi
register, in my case it currently has 39440212
. Of course the register can differ depend on compilation. This number is not dividable by 16 - so Godot must crash. Replacing this instruction by movups
in hexeditor (0x0F29 => 0x0F11) which doesn't need the address to be aligned works properly (doesn't crash, the game starts).
Godot crashes here (in DVector<String>::resize()
):
for (int i=oldsize;i<p_size;i++) {
=> memnew_placement(&t[i], T );
}
The address stored in t
is the same as rdi-0x10
in the assembly, so movaps
uses this not 16-byte aligned address. GCC 6 tries to optimize on address from t
which is not 16-byte aligned (let assume that lock.data()
returns 16-byte aligned address, so (int*)lock.data() + 1
will never be 16-byte aligned). I used 16-bytes for reference counter (4-bytes + 12-bytes padding) and Godot works now properly, so I think that this is quite good solution :) The address is now 16-byte aligned, so the aligned SIMD instructions can work properly. I don't know why GCC automatically uses the aligned instructions when memory is not aligned (maybe it knows that the original address is 16-byte aligned?), but I think that 16-byte alignment can be done always in vector (the data alignment can be used everywhere later)...
Could you please provide PR?
On Tue, May 17, 2016 at 3:56 AM, Błażej Szczygieł notifications@github.com wrote:
On my PC Gidit crashes on:
0x000000000059ee06 <+1382>: pxor xmm0,xmm0 0x000000000059ee0a <+1386>: add rsi,r10 0x000000000059ee0d <+1389>: lea rdi,[rax+rsi*8+0x4] 0x000000000059ee12 <+1394>: xor esi,esi 0x000000000059ee14 <+1396>: add esi,0x1 0x000000000059ee17 <+1399>: add rdi,0x10=> 0x000000000059ee1b <+1403>: movaps XMMWORD PTR [rdi-0x10],xmm0 0x000000000059ee1f <+1407>: cmp r8d,esi
movaps instruction needs the address to be aligned (SSE2) to 16-bytes. The address is currently stored in the rdi register, in my case it currently has 39440212. Of course the register can differ depend on compilation. This number is not dividable by 16 - so Godot must crash. Replacing this instruction by movups in hexeditor (0x0F29 => 0x0F11) which doesn't need the address to be aligned works properly (doesn't crash, the game starts).
Godot crashes here (in DVector
::resize()): for (int i=oldsize;i<p_size;i++) {
=> memnew_placement(&t[i], T ); }
The address stored in t is the same as rdi-0x10 in the assembly, so movaps uses this not 16-byte aligned address. GCC 6 tries to optimize on address from t which is not 16-byte aligned (let assume that lock.data() returns 16-byte aligned address, so (int*)lock.data() + 1 will never be 16-byte aligned). I used 16-bytes for reference counter (4-bytes + 12-bytes padding) and Godot works now properly, so I think that this is quite good solution :) The address is now 16-byte aligned, so the aligned SIMD instructions can work properly. I don't know why GCC automatically uses the aligned instructions when memory is not aligned (maybe it knows that the original address is 16-byte aligned?), but I think that 16-byte alignment can be done always in vector (the data alignment can be used everywhere later)...
— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/godotengine/godot/issues/4623#issuecomment-219591660
Ok, I'll do it. Btw. Vector
uses the same logic, but I can't reproduce crash on Vector
even if address is not aligned, so compiler doesn't generate vectorized code with aligned instructions (Why? The loop which calls constructors (inline constructor in String
case) is the same as in DVector
)?
PR: #4691
The way Godot uses allocated memory kind of sucks and is not SSE friendly, this will be fixed in 3.0 and should also solve this issue, so re-tagging this one.
Is this still valid in the current master branch?
Is this still valid in the current master branch?
Bump.
Haven't used Linux in a while. I can test it but not soon.
While testing some 2.1 PR got some kind of random crashes with
ERROR: move_child: Invalid new child position
as the last message, using GCC 6.
I have not experienced that with master when I tried a while ago, now got some errors when trying to compile release_debug.
Got some time to test. Can't even open the editor. The project manager opens fine, but when I open even an empty project I get a segfault.
Thread 1 "godot.x11.opt.t" received signal SIGSEGV, Segmentation fault.
__cxxabiv1::__dynamic_cast (src_ptr=0x0, src_type=0x251f468 <typeinfo for Object>,
dst_type=0x1cb6d68 <typeinfo for ScriptEditorBase>, src2dst=0)
at /build/gcc-multilib/src/gcc/libstdc++-v3/libsupc++/dyncast.cc:50
50 /build/gcc-multilib/src/gcc/libstdc++-v3/libsupc++/dyncast.cc: No such file or directory.
(gdb) where
#0 __cxxabiv1::__dynamic_cast (src_ptr=0x0, src_type=0x251f468 <typeinfo for Object>,
dst_type=0x1cb6d68 <typeinfo for ScriptEditorBase>, src2dst=0)
at /build/gcc-multilib/src/gcc/libstdc++-v3/libsupc++/dyncast.cc:50
#1 0x000000000104f529 in ScriptEditor::_update_members_overview() ()
#2 0x0000000001051728 in ScriptEditor::_update_script_names() ()
#3 0x00000000009ad2dc in MethodBind0::call(Object*, Variant const**, int, Variant::CallError&) ()
#4 0x0000000001b304e3 in Object::call(StringName const&, Variant const**, int, Variant::CallError&) ()
#5 0x0000000001b1e542 in MessageQueue::_call_function(Object*, StringName const&, Variant const*, int, bool) ()
#6 0x0000000001b1e7c7 in MessageQueue::flush() ()
#7 0x00000000008a9249 in Main::iteration() ()
#8 0x0000000000897711 in OS_X11::run() ()
#9 0x0000000000891b78 in main ()
With target=debug
the editor runs fine.
When I try to run an empty project with the release template I get this:
Thread 1 "godot.x11.opt.6" received signal SIGSEGV, Segmentation fault.
0x0000000001671620 in atomic_conditional_increment(unsigned int*) ()
(gdb) bt
#0 0x0000000001671620 in atomic_conditional_increment(unsigned int*) ()
#1 0x0000000001594028 in Reference::init_ref() ()
#2 0x00000000015adc6a in Image::load(String const&) ()
#3 0x00000000007b0f2c in Main::start() ()
#4 0x000000000078e75b in main ()
@vnen weird, I was able to compile it on another machine ( 01ed55987 ), just can't run scenes but editor opens (running window closes without any message, like on a recent solved issue I can't find).
The editor is not meant to be compiled with target=release, it needs release_debug to work, so I'm not sure this is a bug anymore..
@reduz this one is not about the editor, it crashed when playing the demos in a release export. The issue with the editor (#4673) happens in release_debug
.
I changed the way memory is allocated since the initial crashes, so it should always be aligned to 16, does the crash still happen?
Don't know. I don't have a Linux installation at the moment, so I can't test it. Maybe @zaps166 knows something about this.
The SSE bugs are fixed, so the only bug remaining was the cast_to<> bug. But this is not really a compiler bug but a godot bug.. so I think it's safe to close this
Compiling Godot for release gives consistent
SIGSEGV
signal when running some of the demos. It seems to be related to the resource loader based on what I get from backtraces.This is what I get in master with the GUI drag and drop demo (I get an identical BT in other GUI demos, such as input mapping and translation):
I went as far as 37af8b4 and still get the same thing, even with a clean build. At that point, the 2d platformer also raises a similar error:
Is
DVector<T>::resize()
the culprit?Also, the 2D platformer does not have such problem in current master, but it was updated to the new
.tscn
format, which may have mitigated the problem. So I guess this can be some compatibility breakage introduced somewhere, so old binary scenes don't load right. It certainly can be something else.And I hope somebody else can reproduce this issue or I may have to burn my computer 👿