Closed waldotf closed 1 year ago
I'm getting the following when I try to load SP
What do you get when launching with +developer 1 as well as having your logging level (in ../cfg/source-python/core_settings.ini) set to 5? Also, what is the output of version?
Your ../addons/source-python/data/source-python/* directory appears to be missing entirely. My first guess would be that the data update failed somehow (or, at the very least, its extraction didn't succeed). Try deleting your ../addons/source-python/data/source-python-data.zip file and restarting your server.
That seems to be it; here's the first boot after deleting the .zip:
Another restart looks the same as the original log. Not sure if it's what matters here, but here's the installed zlib:
root@71a0d0effe7a:/# dpkg -l | grep zlib
ii zlib1g:i386 1:1.2.11.dfsg-1+deb10u2 i386 compression library - runtime
Another restart looks the same as the original log.
That's because SP assumes the data is up to date since the zip's checksum is the same. A quick Google search for cat: hlds.7.pid: No such file or directory yielded the following thread as the first result: tf2 crashing 'cat: hlds.13915.pid: No such file or directory'. Perhaps it can point you in the right direction.
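The checksum behaviour described above can be pictured with a minimal sketch. This is not SP's actual update code: the hash algorithm, the function names (file_checksum, needs_unpack) and the source of the expected checksum are all assumptions made for illustration. It only shows why a restart with the same zip in place skips extraction, while deleting the zip forces it.

```python
import hashlib
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 hex digest of the file at ``path`` (hash choice is an assumption)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_unpack(zip_path, expected_checksum):
    """Return True if the archive is missing or its checksum differs (hypothetical helper)."""
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        return True
    return file_checksum(zip_path) != expected_checksum
```

With a check like this, an unchanged zip is treated as already unpacked even if the earlier extraction never completed.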
Maybe the server is unable to extract the new data, because of a permission problem?
I tried a few extra library packages suggested in various posts without success. All of srcds/ is owned by the user which is running the server process.
I pared out _unpack_data from https://github.com/Source-Python-Dev-Team/Source.Python/blob/1f87696909a95ec1177c6b96f7602b2ec75fdb87/addons/source-python/packages/source-python/core/update.py#L311 into a simple test.py:
from zipfile import ZipFile


def _unpack_data(path):
    """Unpack ``source-python-data.zip`` into the given path.

    :param Path path:
        The path the data file should be unpacked into.
    """
    with ZipFile("source-python-data.zip") as zip:
        zip.extractall(path)


_unpack_data("source-python-data-test")
and that extracts fine as the same user, using the system Python (3.7.3). If I put that in tf/addons/source-python/data/source-python and restart, everything works as expected (i.e. this seems to be the only issue).
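To tell a corrupt archive apart from a failing extraction, a test like the one above could also verify the archive first. The sketch below is only an illustration; the function name check_and_unpack is hypothetical and not part of SP. It uses the standard ZipFile.testzip() call, which returns the first member with a bad CRC or header, or None when every member checks out.

```python
from zipfile import ZipFile


def check_and_unpack(zip_path, target_dir):
    """Verify every member's CRC, then extract; return the number of members."""
    with ZipFile(zip_path) as archive:
        bad_member = archive.testzip()  # None when all members read back cleanly
        if bad_member is not None:
            raise ValueError(f"corrupt member in archive: {bad_member}")
        archive.extractall(target_dir)
        return len(archive.namelist())
```

If this succeeds under one interpreter but the same archive crashes another, the archive itself is unlikely to be the cause.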
Since it's getting to this log line https://github.com/Source-Python-Dev-Team/Source.Python/blob/1f87696909a95ec1177c6b96f7602b2ec75fdb87/addons/source-python/packages/source-python/core/update.py#L317 and failing to extract, my guess is it's something specific to SP's Python install. Is there an easy way to invoke that?
Looks like ZipFile isn't threadsafe in our version (fix only backported as far as 3.7). Not sure if that's relevant at all.
The crash is definitely isolated to that call, but I can't reproduce it outside of SP and I don't know how to get a proper backtrace for it with gdb because of all the forking in srcds.
Looks like ZipFile isn't threadsafe in our version (fix only backported as far as 3.7). Not sure if that's relevant at all.
Very unlikely, because the extraction is done on the main thread.
The crash is definitely isolated to that call, but I can't reproduce it outside of SP and I don't know how to get a proper backtrace for it with gdb because of all the forking in srcds.
If you look at the debug.log generated, it should contain the entire stack trace. I suspect one of the calls is not resolved from the right shared library, likely due to a misconfigured multiarch setup or something. More than likely zlib (or one of its dependencies). Try running ldd on addons/source-python/Python3/lib-dynload/zlib.cpython-36m-i386-linux-gnu.so.
srcds/debug.log is just entries that look like this:
----------------------------------------------
CRASH: Wed Feb 15 06:12:54 UTC 2023
Start Line: ./srcds_linux -game tf -debug -port 27025 +map jump_beef -maxplayers 24 +developer 1 -norestart
End of Source crash report
----------------------------------------------
Doesn't look like any of these are the wrong arch, but I don't know what the missing libpython means.
# ldd tf/addons/source-python/Python3/lib-dynload/zlib.cpython-36m-i386-linux-gnu.so
linux-gate.so.1 (0xf7fc6000)
libz.so.1 => /lib/i386-linux-gnu/libz.so.1 (0xf7f97000)
libpython3.6m.so.1.0 => not found
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf7f76000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7d98000)
/lib/ld-linux.so.2 (0xf7fc8000)
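When checking several modules this way, the unresolved entries can be pulled out of the ldd output with a small helper. This is a minimal sketch; the function name missing_libraries is hypothetical, and the sample text is the output shown above.

```python
def missing_libraries(ldd_output):
    """Return the library names that ldd reported as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Resolved and unresolved entries both use the "name => target" form.
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


SAMPLE = """\
linux-gate.so.1 (0xf7fc6000)
libz.so.1 => /lib/i386-linux-gnu/libz.so.1 (0xf7f97000)
libpython3.6m.so.1.0 => not found
libpthread.so.0 => /lib/i386-linux-gnu/libpthread.so.0 (0xf7f76000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7d98000)
/lib/ld-linux.so.2 (0xf7fc8000)
"""

print(missing_libraries(SAMPLE))  # ['libpython3.6m.so.1.0']
```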
Actually, I think I know exactly why it happens. Try the following:
export LD_PRELOAD=/addons/source-python/Python3/plat-linux/libz.so.1
srcds_run -game tf ...
Same segfault with the preloaded zlib.
Try: [removed attachment]
Oops, looks like I was invoking srcds_run weirdly and that prevented it from properly preloading. Using either zlib 1.2.3 or 1.2.5 fixes the issue.
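For anyone launching the server from a script rather than a shell, the same preload can be passed through the process environment. The sketch below is an illustration only; preload_env is a hypothetical helper, and the library path and server arguments in the usage comment depend on the installation.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with ``library_path`` first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons or spaces; keep any existing ones.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


# Hypothetical usage; adjust the paths and arguments to your setup:
# import subprocess
# subprocess.run(["./srcds_run", "-game", "tf"], env=preload_env("/path/to/libz.so.1"))
```

Passing the environment explicitly makes it easier to confirm that the variable actually reaches the server process.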
The system (Debian 10) version was 1.2.11, which didn't work. Debian 9 repos (out of support since 2020-07-06) have 1.2.8, which did work.
Thanks a ton for the help.
Theoretically, pre-loading your installed version (/lib/i386-linux-gnu/libz.so.1) should do the trick as well, but you may face different issues in different contexts. Actually, for science, could you test it out? I don't currently have time to go into details, but I will explain what is going on and why it happens in a few hours.
Preloading the installed library version does make it work, that's some cool black magic.
Alright so, as promised, here are some details. The problem is that, on TF2, replay_srv.so statically links zlib 1.2.5:
And is globally exporting all of its symbols:
This is problematic because, while the initial call to inflateInit2 properly resolves to zlib, the subsequent calls it makes internally are resolved to replay_srv. More precisely, the following call inside libz.inflateInit2:
ret = inflateReset2(strm, windowBits);
is effectively invoking replay_srv.inflateReset2 instead of libz.inflateReset2. So, if the installed version of zlib is different from 1.2.5, the behaviour is basically undefined, because the structures and internals may or may not differ. In this specific case, z_stream_s.zalloc() appears to actually call z_stream_s.zfree() instead... which obviously is not good, heh.
Anyways, pre-loading zlib ensures all symbols we dynamically look up resolve before the ones exported by replay_srv.
In conclusion, although I'm quite confident pre-loading any version should suffice, I'm not 100% sure it won't result in different issues (e.g. pointers allocated by 1.2.5 being passed to the pre-loaded one, etc., if any other binary is dynamically resolving symbols). So, using 1.2.5 is probably for the best so that everything is consistent regardless of the resolutions.
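One way to spot a version mismatch from Python is to compare the zlib version the interpreter was built against with the one it reports at runtime. This is a minimal sketch; zlib_versions and same_major are hypothetical helper names, while zlib.ZLIB_VERSION and zlib.ZLIB_RUNTIME_VERSION are standard library constants. Note that with the symbol interposition described above, the runtime value depends on which zlibVersion symbol ends up being resolved, so treat it as a hint rather than proof.

```python
import zlib


def zlib_versions():
    """Return (built_against, loaded_at_runtime) zlib version strings."""
    return zlib.ZLIB_VERSION, zlib.ZLIB_RUNTIME_VERSION


def same_major(built, runtime):
    """Apply zlib.h's consistency rule: the first character of both versions must match."""
    return built[:1] == runtime[:1]


built, runtime = zlib_versions()
print(f"built against {built}, running with {runtime}, compatible: {same_major(built, runtime)}")
```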
That's pretty annoying; I guess it makes sense though. I'm baking this into a Docker image so I'll just include the 1.2.5 binary and preload it to be safe (and avoid ancient repo purgatory). I'm definitely impressed you figured it out.
Doesn't seem like this is really an SP issue at all, just another "srcds is jank" thing. Still, thanks for the help and hopefully this is helpful to someone in the future.
Out of curiosity, what is the console output you get with the following (without preloading anything, and deleting your source-python-data.zip so that it is being extracted): [removed attachment]
Looks like that works as intended.
Nice! That means programmatically preloading zlib prior to loading Python is enough to take priority over replay_srv. I included 1.2.11 because this is the version Python 3.6.1 is built against. Moreover, zlib's manual states that as long as the first digit matches, it should be backward compatible. The other way around, not so much, due to missing symbols as we've seen here. I will push the changes shortly.
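The idea of loading a library programmatically with globally visible symbols can be illustrated from Python with ctypes. This is only a sketch of the dlopen-style call, not the change pushed to SP, which has to happen before Python itself is loaded. The helper name preload_zlib is hypothetical, and whether such a load takes priority over another binary's exports depends on load order and flags.

```python
import ctypes
import ctypes.util


def preload_zlib(path=None):
    """Load zlib with RTLD_GLOBAL and return its version string, or None if unavailable."""
    path = path or ctypes.util.find_library("z")
    if path is None:
        return None
    try:
        # RTLD_GLOBAL makes the library's symbols available to libraries loaded afterwards.
        lib = ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)
    except OSError:
        return None
    lib.zlibVersion.restype = ctypes.c_char_p
    return lib.zlibVersion().decode("ascii")
```

Calling preload_zlib() without arguments uses whichever zlib the system linker finds; passing an explicit path pins a specific build.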
This is almost certainly a system configuration issue on my end, but I'm getting the following when I try to load SP:
As you'd expect, this causes many plugin features to misbehave/segfault/etc. Guessing this is a system config issue, though I do have the required libs installed. Here's sp info:
Separately, the registration process for the forums seems to be broken (no activation emails get sent). I retried a few times over a day or so.