Aleph-One-Marathon / alephone

Aleph One is the open source continuation of Bungie’s Marathon 2 game engine.
https://alephone.lhowon.org/
GNU General Public License v3.0

Segmentation faults with 20150907 alephone #16

Closed. orbea closed this issue 8 years ago

orbea commented 8 years ago

I installed the 20150907 versions of alephone and the Marathon games on slackware-current and found that all three Marathon games segfault on startup. I then tried the 20150620 version of alephone and the problem did not occur: everything worked as intended.

Aleph One Linux 2015-09-07 1.3a1
https://alephone.lhowon.org/

Original code by Bungie Software <http://www.bungie.com/>
Additional work by Loren Petrich, Chris Pruett, Rhys Hill et al.
TCP/IP networking by Woody Zenfell
Expat XML library by James Clark
SDL port by Christian Bauer <Christian.Bauer@uni-mainz.de>

This is free software with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
For details, see the file COPYING.

Built with network play enabled.

Built with Lua scripting enabled.
Segmentation fault

Here is one of the default scripts that the slackbuilds provide to start the games:

#!/bin/sh

ALEPHONE_DATA="/usr/share/AlephOne/gamedata/Marathon"
export ALEPHONE_DATA

exec alephone "$@"

The slackbuilds can be found here:
http://slackbuilds.org/repository/14.1/games/alephone/
http://slackbuilds.org/repository/14.1/games/marathon-data/
http://slackbuilds.org/repository/14.1/games/marathon2-data/
http://slackbuilds.org/repository/14.1/games/marathon-infinity-data/

LidMop commented 8 years ago

I, too, always get segfaults on startup, but only with 20151009 compiled with Boost 1.56 or later (up to 1.60, the latest). The stack trace given below is for the Boost 1.60 case, but stack traces for earlier versions are essentially the same (only minor details in the Boost parts differ).

FUNCTION                                                   FILE                                                LINE
std::__cxx11::basic_string::_Alloc_hider::_Alloc_hider.....c++/5/bits/basic_string.h............................109
std::__cxx11::basic_string::basic_string...................c++/5/bits/basic_string.h............................400
boost::property_tree::basic_ptree::basic_ptree.............boost/property_tree/detail/ptree_implementation.hpp..193
InfoTree::InfoTree.........................................InfoTree.h............................................40
boost::range_detail::any_iterator::dereference.............boost/range/detail/any_iterator.hpp..................512
boost::iterators::iterator_core_access::dereference........boost/iterator/iterator_facade.hpp...................549
boost::iterators::detail::iterator_facade_base::operator*..boost/iterator/iterator_facade.hpp...................655
boost::foreach_detail_::deref..............................boost/foreach.hpp....................................771
_ParseAllMML...............................................XML_MakeRoot.cpp......................................98
ParseMMLFromFile...........................................XML_MakeRoot.cpp.....................................168
_ParseMMLDirectory.........................................shell.cpp...........................................1576
LoadBaseMMLScripts.........................................shell.cpp...........................................1589
initialize_application.....................................shell.cpp............................................554
main.......................................................shell.cpp............................................360

If built with Boost 1.55, 1.54, or 1.53, both 20151009 and 20150907 always exit with error code 1 at line 559 in shell.cpp instead of segfaulting. Side note: the preceding line in that file is supposed to output "Can't find required text strings (missing MML?)" to stderr but that string fails to show up in my terminal when the program is run (???).

The results given above are for Lubuntu 15.10 x64 in a VirtualBox virtual machine. I always supply the path to an appropriately-versioned scenario data folder as a program argument (this M2 version or this Infinity version for 20150907 and this M2 version or this Infinity version for 20151009).

In comparison, 20150620 works perfectly with any version of Boost.

I also build Aleph One natively on Windows (Windows 7 x64) using my own custom build configuration. This method is, of course, not supported, but the results are interesting: 20151009 works perfectly if built with Boost 1.55, but segfaults in the same way as described above if built with later Boost versions.

On the Linux side, I believe I must be doing something wrong because (I assume) at least some people are able to get at least 20150907 working. I just wish I knew what it was....

treellama commented 8 years ago

I think the problem is in here:

InfoTree::const_child_range InfoTree::children_named(std::string key) const
{
    return InfoTree::const_child_range(_children_named_helper(key, begin(), end()),
                                       _children_named_helper(key, end(), end()));
}

In particular, I don't trust the second iterator there. I suspect (but can't yet prove) some equality test fails in BOOST_FOREACH and it ends up dereferencing the end iterator, causing the crash.
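For readers unfamiliar with the (key, begin(), end()) / (key, end(), end()) shape, here is a minimal sketch of a filtering iterator written against the standard library only. It assumes, without confirmation from the source, that _children_named_helper builds something of this general kind. FilterIter and select_matching are hypothetical names, and this does not diagnose the crash. It only shows the invariant such a loop relies on: the iterator carries the underlying end so that skipping stops there, and the loop terminates by comparing equal to an end iterator that is never dereferenced.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Hypothetical minimal filtering iterator (not Aleph One's or Boost's code).
template <class It, class Pred>
class FilterIter {
public:
    FilterIter(Pred pred, It cur, It end)
        : pred_(pred), cur_(cur), end_(end)
    {
        skip_rejected();  // land on the first accepted element, or on end
    }

    typename std::iterator_traits<It>::reference operator*() const { return *cur_; }

    FilterIter& operator++()
    {
        ++cur_;
        skip_rejected();
        return *this;
    }

    // Only the current position takes part in the comparison.
    bool operator!=(const FilterIter& other) const { return cur_ != other.cur_; }

private:
    // Never advances past end_, so *cur_ is only evaluated on real elements.
    void skip_rejected()
    {
        while (cur_ != end_ && !pred_(*cur_)) ++cur_;
    }

    Pred pred_;
    It cur_;
    It end_;
};

// Hypothetical helper: copies out every item equal to `wanted`.
inline std::vector<std::string> select_matching(const std::vector<std::string>& items,
                                                const std::string& wanted)
{
    auto pred = [&wanted](const std::string& s) { return s == wanted; };
    using Iter = FilterIter<std::vector<std::string>::const_iterator, decltype(pred)>;

    Iter first(pred, items.begin(), items.end());  // (pred, begin, end)
    Iter last(pred, items.end(), items.end());     // (pred, end, end)

    std::vector<std::string> out;
    for (; first != last; ++first) out.push_back(*first);
    return out;
}
```

If the comparison between the two iterators ever failed to report equality at the end of the range, the loop body would dereference the end position, which is the failure mode suspected above.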

The whole children_named setup is particularly obscure. If the goal is to make something that can be easily passed to BOOST_FOREACH, instead of using a standard for loop across InfoTree::equal_range or the filter_iterator directly, maybe it would be OK to copy_if InfoTree references to a list and return that?
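A rough sketch of the copy_if idea, using only the standard library. Child, Children and children_named_copy are hypothetical stand-ins for illustration, not InfoTree's actual types or API. The function collects references to the matching children into a vector that a range-based for loop or BOOST_FOREACH could walk directly.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for a tree's (key, value) child entries.
struct Child {
    std::string key;
    std::string value;
};
using Children = std::vector<Child>;

// Returns references to every child whose key matches.
// The references stay valid only while `children` is alive and unmodified.
inline std::vector<std::reference_wrapper<const Child>>
children_named_copy(const Children& children, const std::string& key)
{
    std::vector<std::reference_wrapper<const Child>> matches;
    std::copy_if(children.begin(), children.end(), std::back_inserter(matches),
                 [&key](const Child& c) { return c.key == key; });
    return matches;
}
```

The trade-off is one small allocation per call in exchange for handing the caller an ordinary container, with no custom iterator types involved in the loop.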

Any ideas, Hopper?

Hopper262 commented 8 years ago

After some trouble getting Boost 1.60 built for my machine, I reproduced the problem and it's now fixed for me. I switched to equal_range and used a Boost.Range adaptor to handle the rest.
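To illustrate the equal_range part in isolation, here is a minimal sketch that uses std::multimap as a stand-in for the property tree (boost::property_tree::ptree also has an equal_range member). Tree and values_named are hypothetical names, and this is not the code from the actual fix, which additionally uses a Boost.Range adaptor.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in container: several children may share one key.
using Tree = std::multimap<std::string, std::string>;

// Collects the values of every child stored under `key`.
inline std::vector<std::string> values_named(const Tree& tree, const std::string& key)
{
    std::vector<std::string> values;
    // equal_range returns the [first, last) pair for this key in one call,
    // so both loop bounds come from the same lookup.
    auto range = tree.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
        values.push_back(it->second);  // unpack the value, drop the key
    }
    return values;
}
```

Because the container itself produces both ends of the range, the loop does not depend on comparing two separately constructed filtering iterators.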

Brevity for the caller is the top priority for children_named, as this pattern is repeated over 200 times throughout the code. That's why it bends over backwards to not only make looping easy, but to unpack the key-value pairs you'd get from a normal traversal.

If it's still segfaulting, please reopen this issue.

orbea commented 8 years ago

Thanks, this seems to have fixed it.

I applied commits https://github.com/Aleph-One-Marathon/alephone/commit/718fe46902106bdd15e6655817eccd8eb15948b8 and https://github.com/Aleph-One-Marathon/alephone/commit/c0fc7680221af28794357e92d1db6ebd83b1adcb as patches to the 20150907 tarball, and the three main scenarios no longer segfault. Additionally, I tested Marathon Eternal, Evil, Red, Rubicon and Tempus Irae, which all start.

LidMop commented 8 years ago

That fixed it for me as well (both Linux and Windows). Thank you!