cocos2d / cocos2d-x

Cocos2d-x is a suite of open-source, cross-platform, game-development tools utilized by millions of developers across the globe. Its core has evolved to serve as the foundation for Cocos Creator 1.x & 2.x.
https://www.cocos.com/en/cocos2d-x

SEGFAULT Linux/cpp-tests #19021

Open alphaonex86 opened 6 years ago

alphaonex86 commented 6 years ago

Steps to Reproduce:

  1. Compile everything under Gentoo.
  2. Start build/linux-build/bin/Debug/cpp-tests/cpp-tests under gdb or valgrind.
  3. It segfaults under gdb.
  4. Valgrind clearly reports a memory access problem.

crazyhappygame commented 6 years ago

What test did you run?

alphaonex86 commented 6 years ago

The test binary: bin/Debug/cpp-tests/cpp-tests

Answer: Start AutoTest

crazyhappygame commented 6 years ago

I think that only Ubuntu with gcc 4.9 is supported. The main reason for this is the precompiled libraries. Could you reproduce the error on Ubuntu with gcc 4.9?

v1993 commented 6 years ago

@crazyhappygame Precompiled libraries don't limit the choice of compiler. I use zapcc and clang without problems.

@alphaonex86 Yes, I can confirm the problem. Which test executes just before the crash?

crazyhappygame commented 6 years ago

@v1993 @alphaonex86 Please reproduce the problem on Ubuntu with gcc 4.9, because otherwise it will have low priority. (I know that in theory gcc 4.9 and 7.3 libs should work together, but I also know from practice that this is not always true.)

I have many random crashes in my apps which I cannot reproduce; I only see them in Google Play. I have a feeling that they could be related to memory corruption or a race condition in cocos2d-x code.

v1993 commented 6 years ago

@crazyhappygame gcc 4.9 is very deprecated. I use Ubuntu (well, Xubuntu).

BTW, the installer script can't find them even in the extra repo on Ubuntu 18.04.

What about trying valgrind? It can trace memory corruption pretty well.

alphaonex86 commented 6 years ago

Start AutoTest

It can be reproduced on any platform, and the memory errors can be checked with valgrind.

crazyhappygame commented 6 years ago

@v1993 @alphaonex86 Did you run build/install-deps-linux.sh to set up all dependencies? Could you write a bash script that reproduces this problem? It should be a full script that can be run on a clean Ubuntu instance (I would like to run it on a clean virtual machine). It should include everything: downloading the cocos2d-x sources, installing dependencies (cocos2d-x deps, valgrind, compilers, etc.), and the command line to run valgrind. I would like to check it, but setting up cocos2d-x on Linux is a pain (as far as I know, everyone has a slightly different procedure...).

crazyhappygame commented 6 years ago

Do you know how to run valgrind on Android? Some docs: https://source.android.com/devices/tech/debug/valgrind

The latest debug cpp-tests Android build: https://ci.appveyor.com/project/minggo/cocos2d-x/build/1.0.1222/job/wwk9c4i5mjlc68nq/artifacts

Running the Android cpp-tests under valgrind may reveal another set of interesting issues. I tried to do that, but with no success so far.

v1993 commented 6 years ago

build/install-deps-linux.sh works very badly on Ubuntu 18.04 systems: it fails to install packages, and it builds and installs software that is already present in the repos.

We have to update it (this is pretty major, since 18.04 is an LTS release; nobody will use an old Ubuntu just because someone didn't update the deps scripts).

crazyhappygame commented 6 years ago

I ran drmemory (https://github.com/DynamoRIO/drmemory) on the win32 debug build and found only the following (plus lots of false positives, which I removed):

Error #2182: UNINITIALIZED READ: reading 0x04959600-0x04959604 4 byte(s)
# 0 libcocos2d.dll!cocos2d::ui::EditBoxImplWin::cleanupEditCtrl         [g:\cocos2d-x-3.17\cocos\ui\uieditbox\uieditboximpl-win32.cpp:98]
# 1 libcocos2d.dll!cocos2d::ui::EditBoxImplWin::createEditCtrl          [g:\cocos2d-x-3.17\cocos\ui\uieditbox\uieditboximpl-win32.cpp:111]
# 2 libcocos2d.dll!cocos2d::ui::EditBoxImplWin::createNativeControl     [g:\cocos2d-x-3.17\cocos\ui\uieditbox\uieditboximpl-win32.cpp:139]
# 3 libcocos2d.dll!cocos2d::ui::EditBoxImplCommon::initWithSize         [g:\cocos2d-x-3.17\cocos\ui\uieditbox\uieditboximpl-common.cpp:81]
# 4 libcocos2d.dll!cocos2d::ui::EditBox::initWithSizeAndBackgroundSprite [g:\cocos2d-x-3.17\cocos\ui\uieditbox\uieditbox.cpp:103]

Error #2364: UNINITIALIZED READ: reading register ecx
# 0 Camera3DTestDemo::SwitchViewCallback               [g:\cocos2d-x-3.17\tests\cpp-tests\classes\camera3dtest\camera3dtest.cpp:219]
# 1 Camera3DTestDemo::onEnter                          [g:\cocos2d-x-3.17\tests\cpp-tests\classes\camera3dtest\camera3dtest.cpp:352]
# 2 libcocos2d.dll!cocos2d::Director::setNextScene     [g:\cocos2d-x-3.17\cocos\base\ccdirector.cpp:1232]
# 3 libcocos2d.dll!cocos2d::Director::drawScene        [g:\cocos2d-x-3.17\cocos\base\ccdirector.cpp:315]
# 4 libcocos2d.dll!cocos2d::Director::mainLoop         [g:\cocos2d-x-3.17\cocos\base\ccdirector.cpp:1517]
# 5 libcocos2d.dll!cocos2d::Application::run           [g:\cocos2d-x-3.17\cocos\platform\win32\ccapplication-win32.cpp:112]

crazyhappygame commented 6 years ago

@v1993 Travis (https://docs.travis-ci.com/user/reference/overview/) supports Ubuntu Trusty, but it also supports Docker (https://docs.travis-ci.com/user/docker). Could you switch the current Travis build system (based on Trusty and gcc 4.9) to Docker and the latest Ubuntu 18.04? https://github.com/cocos2d/cocos2d-x/blob/v3/.travis.yml https://github.com/cocos2d/cocos2d-x/tree/v3/tools/travis-scripts

JohnCoconut commented 5 years ago

Please ignore this post at the moment, working on it

I have been struggling with this issue for quite some time on my Linux machine. It fails randomly every now and then.

I am running Fedora Linux 30, gcc 9.1.1 on cocos2d-x master branch.

I think the crash is caused by double free in AudioEngineImpl::uncacheAll().

https://github.com/cocos2d/cocos2d-x/blob/9b45665ae532db1acc13c71887a1a2ec0feb3d3f/cocos/audio/linux/AudioEngine-linux.cpp#L294-L304

The problem is in AudioUncacheInFinishedCB test.

https://github.com/cocos2d/cocos2d-x/blob/9b45665ae532db1acc13c71887a1a2ec0feb3d3f/tests/cpp-tests/Classes/NewAudioEngineTest/NewAudioEngineTest.cpp#L1107-L1120

AudioEngine::uncacheAll() is called both at onEnter() and onExit(), causing sound->release() to be invoked twice.

Commenting out AudioEngine::uncacheAll(); in onEnter() consistently prevents the crash, but then nothing is uncached: the map member variables are not cleared if uncacheAll is not called.

minggo commented 5 years ago

The map is cleared, so why would sound->release() be invoked twice?
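To illustrate the point about the cleared map, here is a minimal sketch. FakeSound, SoundCache and their members are hypothetical names invented for this example; they are not the cocos2d-x or audio-library types. It assumes uncacheAll releases each cached entry and then clears the map, in which case a second call finds an empty map and releases nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a sound handle; it only counts release() calls.
struct FakeSound {
    int releaseCalls = 0;
    void release() { ++releaseCalls; }
};

// Hypothetical cache, loosely modelled on the map discussed above.
class SoundCache {
public:
    void add(const std::string& path, FakeSound* sound) {
        assert(sound != nullptr);
        _sounds[path] = sound;
    }

    // Releases every cached sound once, then empties the map.
    void uncacheAll() {
        for (auto& entry : _sounds) {
            entry.second->release();
        }
        _sounds.clear();
    }

    std::size_t size() const { return _sounds.size(); }

private:
    std::map<std::string, FakeSound*> _sounds;
};

// Calls uncacheAll() twice, as onEnter() and onExit() would.
void demoDoubleUncache() {
    FakeSound sound;
    SoundCache cache;
    cache.add("effect.mp3", &sound);
    cache.uncacheAll();
    cache.uncacheAll();  // the map is already empty, so nothing is released
    assert(sound.releaseCalls == 1);
    assert(cache.size() == 0);
}
```

Under these assumptions the second call is harmless, which suggests looking elsewhere for the cause of the crash.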

JohnCoconut commented 5 years ago

@minggo I was wrong. Please ignore my post above. I will make changes to it after I find the cause.

I ran a debugger on it and found that audioRef.channel->stop(); was called on a null channel pointer. Perhaps that's the reason for the crash?

What I found interesting was that sometimes it didn't crash even though channel == nullptr. I thought it would segfault immediately when dereferencing nullptr.
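Dereferencing a null pointer is undefined behaviour in C++, so an immediate segfault is not guaranteed; the program may appear to work on some runs. A minimal sketch of a guard follows. FakeChannel, AudioRef and stopIfAttached are hypothetical names made up for this example, not the engine's real types.

```cpp
#include <cassert>

// Hypothetical stand-in for an audio channel; not the audio library's real type.
struct FakeChannel {
    bool stopped = false;
    void stop() { stopped = true; }
};

// Hypothetical record, loosely mirroring the audioRef discussed above.
struct AudioRef {
    FakeChannel* channel = nullptr;
};

// Stops the channel only when one is attached; returns whether stop() ran.
bool stopIfAttached(AudioRef& audioRef) {
    if (audioRef.channel == nullptr) {
        return false;  // nothing to stop, and no null dereference
    }
    audioRef.channel->stop();
    return true;
}

// Exercises both the null and the non-null case.
void demoStopIfAttached() {
    AudioRef empty;
    assert(!stopIfAttached(empty));

    FakeChannel channel;
    AudioRef playing;
    playing.channel = &channel;
    assert(stopIfAttached(playing));
    assert(channel.stopped);
}
```

Whether skipping the call is the right fix depends on why the channel is null in the first place; the guard only removes the undefined behaviour.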

JohnCoconut commented 5 years ago

I found a null pointer dereference here.

https://github.com/cocos2d/cocos2d-x/blob/9b45665ae532db1acc13c71887a1a2ec0feb3d3f/cocos/audio/linux/AudioEngine-linux.cpp#L107-L118

The preload function sets channel to nullptr in line 109 and dereferences it in line 112.

https://github.com/cocos2d/cocos2d-x/blob/9b45665ae532db1acc13c71887a1a2ec0feb3d3f/cocos/audio/linux/AudioEngine-linux.cpp#L331

The resume call in line 115 sets channel to a non-null value.

minggo commented 5 years ago

Yep, there does seem to be a problem, but then it should crash easily here, right?

JohnCoconut commented 5 years ago

Yes. It crashes easily if we add an assert(mapChannelInfo[id].channel) statement.

But I think the bigger culprit is data races. When I turned on the thread sanitizer in gcc, it reported 258 data races.

Below is the log from thread sanitizer if you're interested.

thread-sanitizer.log
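For readers unfamiliar with this class of bug: a data race occurs when two threads access the same memory without synchronization and at least one of them writes. In gcc and clang the thread sanitizer is enabled with -fsanitize=thread. The sketch below shows one common remedy, guarding a shared map with a mutex. ChannelRegistry and fillFromThreads are hypothetical names for illustration and do not reflect how cocos2d-x organizes its audio state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state; every access takes the same mutex.
class ChannelRegistry {
public:
    void setVolume(int id, float volume) {
        std::lock_guard<std::mutex> lock(_mutex);
        _volumes[id] = volume;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _volumes.size();
    }

private:
    mutable std::mutex _mutex;
    std::map<int, float> _volumes;
};

// Writes distinct ids from several threads and returns the final entry count.
std::size_t fillFromThreads(ChannelRegistry& registry, int threadCount, int idsPerThread) {
    assert(threadCount > 0 && idsPerThread > 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threadCount; ++t) {
        workers.emplace_back([&registry, t, idsPerThread] {
            for (int i = 0; i < idsPerThread; ++i) {
                registry.setVolume(t * idsPerThread + i, 1.0f);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return registry.size();
}
```

Without the lock, concurrent inserts into std::map would be a data race of the kind the sanitizer reports; with it, four threads writing 100 ids each should leave 400 entries.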