Closed jamie-marchant closed 6 years ago
This also happens in Debian 9, using the July 17th build
@bibendovsky any thoughts? Caused by your changes or...?
Can't reproduce. Tried the first level (yavin1b) on Windows x86/x64, Ubuntu x64 and Fedora x64. Used 71325abca0f5371d7745b69decad9ae2e88cc716 for tests.
@bibendovsky did you use a newer build? Maybe it's been fixed now?
@jamie-marchant I used latest source code to build binaries.
@bibendovsky so you're saying if we use 'Merge pull request #799 from deathsythe47/master' it will fix the problem? Forgive I'm new to this GitHub stuff.
@jamie-marchant OpenJK is already at this commit. But this one in particular didn't affect your problem, whatever was last week's build should be sufficient, if not tomorrows shouldn't be any different as far as SP Saves are concerned.
Uh, I see, I still have an issue on the build from last week. The question is why is "bibendovsky" not having this issue. Odd as it seems could be it application bit type? I am using 64-bit.
As you see above, @bibendovsky is using 64-bit as well.
Yes the OS is 64-bit but what about the game?
I figure the game is implied to match.
If that is the case I don't know why it can't be reproduced.
@jamie-marchant Do you have issues on the first level (yavin1b) too?
Steps I tried to reproduce the issue:
At this point, if ICARUS has empty state you, for example, can't cut the tree for Rosh, trigger a cutscene before a first door (after killing some Howlers), etc.
Yes I do have this problem on the first level.
Maybe it's GCC issue? Build compiled with GCC 4.7.4 (Ubuntu x64) also does not have such problem. I don't have GCC 4.7.2 (like used in https://builds.openjk.org/) to test.
Noticed that log https://jk.xd.cm/builders/linux-64/builds/858/steps/make/logs/stdio is quite small. Maybe forcing full rebuild can resolve the problem?
This is the log from your weekly build? How easily can you make a full rebuild?
Would it help if I posted a game log? I can consistently reproduce this bug.
@jamie-marchant I don't know what is "weekly build". All tests done with builds compiled by myself.
By weekly builds I mean the ones here: https://builds.openjk.org/,(I thought they where built weekly) those are the ones I use. If you are building your own, where does that log file come from? I don't remember getting a copy to compile myself.
That log file came from the weekly build.
Uh, I see. Thanks for explaining that.
UPDATE: I've updated to Fedora 26 but still have the problem.
Hello everyone!
I have experienced this bug, and have done some testing on it recently. So let me describe the problem as I see it.
I tested builds from builds.openjk.org, as well as builds made by myself on Debian 9 (with gcc 6.3) and on virtual machine running Debian 7 (with gcc 4.7 and pulled from Debian 8 repos gcc-4.9). All builds were tested on my primary Debian 9 system.
The bug did not appear in my Debian 9 build, but did appear in builds from builds.openjk.org as well as in builds from Debian 7 virtual machine with both versions of compiler. Later testings revealed the following problem.
The game code (at least its cgame module) contains some global variables that store some game state related parameters. These variables need to be reinitialized on each level change (and, of course, on each savegame load). This is done via reloading the whole game engine dynamic library, jagamex86_64.so. The problem of those "bugged" builds is that this library, for some reasons, didn't get unloaded (even though the corresponding dlclose() were actually called) and, hence, didn't get reinitialized. This led to some problems with the game behavior.
A brief google search on this problem gave an idea that this may be a result of usage of STB_GNU_UNIQUE symbols by gcc in the corresponding library (see this StackOverflow answer). It can be switched off by -fno-gnu-unique gcc option, and indeed, when I built the game with this compiler option on my Debian 7 virtual machine with gcc-4.9, I got binaries with the library unloading working properly and, hence, without the bug. However, this option seems to be absent in gcc-4.7, so it will be useless for the build server in its current state. Moreover, this means that we cannot use this method to guarantee a proper game build on some old systems. But, as for now, I can not suggest any other options to fix this. Maybe you can see any better options.
Moreover, I still can not explain why this bug appears only on some systems, as the gnu-unique option seems to be turned on by default in all gcc builds I have used for my tests.
That's an interesting find. I should point out that the cgame isn't even a separate module with SP. It's part of the same shared object you mentioned but has a separate initialization that happens as though it was it's own module.
Seems relevant: https://stackoverflow.com/questions/8792363/c-dlclose-doesnt-unload-the-shared-library
Well, done dmitrio95! We now know what the problem is! So the question is can the GCC be upgraded on the server? Does anyone else have any alternative ideas? I don't know enough about GCC, sorry.
Upgrading to newer gcc sounds like putting a bandaid on instead of fixing the issue.
Uh I see.
Thanks for the in-depth investigation @dmitrio95.
This problem has come up before a long time ago, and e8fc6378 was my original fix for the problem.
It got broken a while later when the compiler flags I added were removed.. What's interesting though is the commit I referenced in my commit message: ioquake/jedi-academy@d6bc68d which says in its commit message:
Fixed dclose problems (stuck in level 4)
- UNIQUE symbols in jk2gamex86.so don't allow to unload the library on map changes!
From what you're saying though @dmitrio95 it seems the dclose
problem isn't actually a bug, but an unexpected behaviour which still complies with the POSIX standard.
I don't have Linux available right now - what does running readelf -Ws ./jagamex86.so | grep UNIQUE
output currently? (or for which ever library had the problem)
Building the game with gcc 4.7 with -fvisibility=hidden
flag seems to produce binaries without that bug. Why were those flags removed?
As for checking the library with readelf -Ws ./jagamex86.so | grep UNIQUE
, this command produced no output on builds that worked in my previous tests, and "broken" builds yielded the following output:
~/Public/OpenJK/install/JediAcademy/OpenJK $ readelf -Ws jagamex86_64.so | grep UNIQUE
114: 0000000000000000 0 OBJECT UNIQUE DEFAULT UND _ZNSs4_Rep20_S_empty_rep_storageE@GLIBCXX_3.4 (3)
665: 0000000000d47640 8 OBJECT UNIQUE DEFAULT 26 _ZGVZNK11CGPProperty11GetTopValueEvE5empty
2002: 0000000000d3d0e0 24 OBJECT UNIQUE DEFAULT 26 _ZZN7CBezier11DrawSegmentEPfS0_ffE7lastEnd
2500: 0000000000d47650 16 OBJECT UNIQUE DEFAULT 26 _ZZNK11CGPProperty11GetTopValueEvE5empty
3924: 0000000000d47650 16 OBJECT UNIQUE DEFAULT 26 _ZZNK11CGPProperty11GetTopValueEvE5empty
4322: 0000000000d47640 8 OBJECT UNIQUE DEFAULT 26 _ZGVZNK11CGPProperty11GetTopValueEvE5empty
4715: 0000000000000000 0 OBJECT UNIQUE DEFAULT UND _ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
4752: 0000000000d3d0e0 24 OBJECT UNIQUE DEFAULT 26 _ZZN7CBezier11DrawSegmentEPfS0_ffE7lastEnd
It is interesting that for the build done with -fvisibility=hidden
this command found some UNIQUE symbols, but this didn't stop the library from being unloaded just in time. Perhaps objects from libstdc++ are used.
~/Public/OpenJK/install_vis_hidden/JediAcademy $ readelf -Ws OpenJK/jagamex86_64.so | grep UNIQUE
123: 0000000000000000 0 OBJECT UNIQUE DEFAULT UND _ZNSs4_Rep20_S_empty_rep_storageE@GLIBCXX_3.4 (3)
5507: 0000000000000000 0 OBJECT UNIQUE DEFAULT UND _ZNSs4_Rep20_S_empty_rep_storageE@@GLIBCXX_3.4
Tried two builds recently (openjk-2017-12-11-eba428e6-linux-64 and openjk-2017-12-25-ae3900fc-linux-64) on Kubuntu 17.04 64bit and this bug still persists. Reproducing it is easy - just die and load a game anywhere, even on Yavin_1. Some things like scripted events and objects that can be moved by the force that are necessary to progress just stop working so, unfortunately, this is a game breaking bug (for example, happened with the second tree that needs to be cut down on Yavin and the crane at the Java sandcrawler).
After reading the above thread, there seems to be a bunch of solutions to the problem. Could you guys incorporate them in the never builds? This issue seems to be ignored for a long time now.
Well, it was an interesting discussion, but the cause of the problem seems to be simpler.
As @xycaleth mentioned, this problem was fixed a long time ago with the addition of -fvisibility=hidden
project-global compiler option. Later this was a bit changed (bace05c9b3da11a92d7f14ac53a69f06a7015804), and this compiler option is now set up on a per-target basis rather than project-wide. The problem comes from the way it is done: for every build target that needs this flag to be set up it is appended to its COMPILE_OPTIONS
property :
set_property(TARGET ${SPGame} APPEND PROPERTY COMPILE_OPTIONS ${OPENJK_VISIBILITY_FLAGS})
However, CMake v. 2.8.9, which is declared to be minimal supported CMake version in CMakeLists.txt, does not support this property, so, when using this version of CMake, the symbol visibility flag does not get added to the actual compiler options list. Debian 7, which is running on the OpenJK build server, contains exactly this version of CMake, so we can now experience this problem.
So, I guess, we have three options here:
COMPILE_OPTIONS
targets property). This will require update of CMake on build server.COMPILE_FLAGS
instead of COMPILE_OPTIONS
. This property, however, is deprecated, so it is not the best option here.Should be fixed by #972
Reporting a bug? Please make sure you've given the following information - thanks!
Operating system and version: Fedor Core 26(64-bit), game version: OpenJK 1.0.1.1(64-bit executable)
Is this for single player or multiplayer? Single player, Jedi Night Academy
Description of the bug (and if possible, steps to reproduce the bug):
What did you expect to happen instead? Everything functions after loading a save game after death.