SFTtech / openage

Free (as in freedom) open source clone of the Age of Empires II engine 🚀
http://openage.dev
Other
12.69k stars 1.12k forks source link

Python 3.8 Multiprocessing issue / converting media stuck on graphics/35.slp #1223

Closed ericdude4 closed 1 year ago

ericdude4 commented 4 years ago

MacOS Catalina 10.15.1 Assets: Age of Empires 2 Conquerors

The conversion seems to work until it gets to graphics/35.slp and then it just hangs

heinezen commented 4 years ago

Huh, that's weird. For clarification: Does 35.slp still get converted (i.e. there's a 35.slp.png file in assets/converted/graphics/) or does it only convert up to that point?

ericdude4 commented 4 years ago

@heinezen Oddly, although the conversion output seems to imply that some .slp files are being successfully imported, there doesn't seem to be a assets/converted/graphics/ directory.

Here's the output:

...
INFO [py] [1331/3605] sounds/6442.wav
INFO [py] [1332/3605] sounds/6443.wav
INFO [py] [1333/3605] sounds/6444.wav
INFO [py] [1334/3605] sounds/6445.wav
INFO [py] [1335/3605] graphics/2.slp
INFO [py] [1336/3605] graphics/5.slp
INFO [py] [1337/3605] graphics/8.slp
INFO [py] [1338/3605] graphics/9.slp
INFO [py] [1339/3605] graphics/12.slp
INFO [py] [1340/3605] graphics/21.slp
INFO [py] [1341/3605] graphics/22.slp
INFO [py] [1342/3605] graphics/23.slp
INFO [py] [1343/3605] graphics/24.slp
INFO [py] [1344/3605] graphics/33.slp
INFO [py] [1345/3605] graphics/34.slp
INFO [py] [1346/3605] graphics/35.slp

Heres the contents of the assets/converted directory:

blending_modes.docx   gamedata              player_palette.docx   string_resources.docx
blendomatic           gamespec_version      sounds                termcolors.docx

I have a 6 core machine, and I see that the conversion happens in parallel. I am seeing that there are 12 .slp files here which begin conversion. With hyperthreading on 12 cores that makes sense. I'm guessing that none of the .spl files are converting.

I should add that I am using an old portable version of AoE2:Conquerors that I used to LAN with friends in high school. However, looking in the data directory - everything seems to be there.

TheJJ commented 4 years ago

Can you invoke the conversion with more verbosity (-v or -vv) please? Then we should see where the process gets stuck.

heinezen commented 4 years ago

If there's no graphics directory, then this is probably caused by the multi-core conversion routine. Does your processor have 6 cores/12 threads? Edit: Nevermind, you already mentioned that

ericdude4 commented 4 years ago

Wow! I just noticed that the interface.drs - in my game's data directory is spelled interfac.drs 🤦‍♂

Edit: Tried this, seems that interfac.drs is the correct spelling

ericdude4 commented 4 years ago

@TheJJ Is there a convert command which I can append those flags to? I am using make run and the -vv flags just show me version info

heinezen commented 4 years ago

Use bin/run convert --help for a list of commands, e.g. bin/run convert -v

heinezen commented 4 years ago

This issue might be related to #1091 (on Windows).

ericdude4 commented 4 years ago

With -v:

INFO [py] [1332/3605] sounds/6443.wav
 DBG [T3] [py] Wavefile: Total length: 54392    Header length: 44   Trailer length: 76
 DBG [T3] [py] Resampling necessary: input rate: 22050  coding rate: 24000
 DBG [T3] [py] Resampled data length: 59070
 DBG [T3] [py] Starting encoding loop.
INFO [py] [1333/3605] sounds/6444.wav
 DBG [T12] [py] Wavefile: Total length: 14834   Header length: 44   Trailer length: 76
 DBG [T12] [py] Resampling necessary: input rate: 22050 coding rate: 24000
 DBG [T12] [py] Resampled data length: 16014
 DBG [T12] [py] Starting encoding loop.
INFO [py] [1334/3605] sounds/6445.wav
INFO [py] [1335/3605] graphics/2.slp
INFO [py] [1336/3605] graphics/5.slp
INFO [py] [1337/3605] graphics/8.slp

Looks like that last 6445.wav file doesn't have any debug info following it. 🤷‍♂

heinezen commented 4 years ago

Can you try the workaround from the top post #1091 and see if that changes anything? The file mentioned should be bin/openage/convert/slp_converter_pool.py

ericdude4 commented 4 years ago

Here is the contents of my data directory:

BlkEdge.Dat       TileEdge.Dat      gamedata_COPY.drs list.crx
FilterMaps.dat    blendomatic.dat   gamedata_x1.drs   shadow.col
LoQMaps.dat       closedpw.exe      graphics.drs      sounds.drs
Load              empires2.dat      interfac.drs      sounds_x1.drs
PatternMasks.dat  empires2_x1.dat   lightMaps.dat     terrain.drs
STemplet.dat      gamedata.drs      list.cr           view_icm.dat
ericdude4 commented 4 years ago

@heinezen Sorry, I don't completely follow. I see that the has_x1_p1 function was merged into master a while ago. There is no suggestion in the top post. Did you send a link to the correct thread?

heinezen commented 4 years ago

ups sry, I had the numbers of the issue flipped. i wanted to mention #1091 :D

ericdude4 commented 4 years ago

Cool! Progress. Making that change gave me a stack trace:

Traceback (most recent call last):
  File "run.py", line 15, in init run
    main()
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/__main__.py", line 129, in main
    return args.entrypoint(args, cli.error)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/main.py", line 644, in main
    if not convert_assets(outdir, args, srcdir):
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/main.py", line 192, in convert_assets
    for current_item in convert(args):
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 120, in convert
    yield from convert_media(args)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 291, in convert_media
    yield from concurrent_chain(
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/util/threading.py", line 48, in concurrent_chain
    raise value
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/util/threading.py", line 106, in generator_to_queue
    for item in generator:
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 419, in convert_mediafile
    convert_slp(filepath, dirname, names_map, converter_pool, args)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/driver.py", line 360, in convert_slp
    texture = converter_pool.convert(indata, cutter)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/convert/slp_converter_pool.py", line 102, in convert
    return Texture(SLP(slpdata), self.palette, custom_cutter=custom_cutter)
  File "openage/convert/slp.pyx", line 138, in openage.convert.slp.SLP.__init__
    spam(frame_info)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 101, in spam
    logging.spam(msg, *args, **kwargs)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 68, in _spam
    self._log(SPAM, msg, args, **kwargs)  # pylint: disable=W0212
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1577, in _log
    self.handle(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1587, in handle
    self.callHandlers(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
    hdlr.handle(record)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 950, in handle
    self.emit(record)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/log/__init__.py", line 45, in emit
    log_cpp(cpp_level, record.getMessage(), record.filename, record.funcName, record.lineno)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "openage/convert/slp.pyx", line 208, in openage.convert.slp.FrameInfo.__repr__
    "% 4d" % self.version,
TypeError: %d format: a number is required, not bytes

In the concurrent environment, this crash probably didn't stop the overall process.

heinezen commented 4 years ago

This is an error, but I don't think it's the one that caused your initial crash. Can you run again without the -v flag?

ericdude4 commented 4 years ago

Compilation worked! I ran into another issue now. It may be that my version of AoE is just messed. I feel bad for taking up your time for my probably one-off issue, so I am willing to give up. However, here is the stack trace I am getting: (I am not a python guy)

Traceback (most recent call last):
  File "run.py", line 15, in init run
    main()
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/__main__.py", line 129, in main
    return args.entrypoint(args, cli.error)
  File "/Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none/openage/game/main.py", line 78, in main
    return run_game(args, root)
  File "openage/game/main_cpp.pyx", line 47, in openage.game.main_cpp.run_game
    result = run_game_cpp(args_cpp)
  File "", line 0, in __pyx_pw_7openage_4game_8main_cpp_1run_game(_object*, _object*, _object*)+0xd71
  File "", line 0, in openage::run_game(openage::main_arguments const&)+0x184
  File "", line 0, in openage::Engine::Engine(openage::util::Path const&, int, bool, char const*)+0xff6
  File "", line 0, in openage::gui::GUI::GUI(SDL_Window*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, openage::gui::EngineQMLInfo*)+0x60d
  File "", line 0, in openage::shader::Program::link()+0x39
  File "", line 0, in openage::shader::Program::check(unsigned int)+0x18d
  File "", line 0, in openage::error::Error::Error(openage::log::message const&, bool, bool)+0x111
  File "libopenage/shader/program.cpp", line 97, in void openage::shader::Program::check(GLenum)
    auto errormsg = MSG(err);
openage.cppinterface.exctranslate.CPPException: Program validation failed
Validation Failed: Current draw framebuffer is invalid.
heinezen commented 4 years ago

Looks like you've reached #1207 which is still not solved unfortunately :/ I fixed your other bug with the version output though in #1224.

For the multiprocessing issue: Don't know why it locks up on Windows and macOS, but still works on Linux. This might be an upstream issue.

ericdude4 commented 4 years ago

if #1222 isn't fixed as you are saying, I can create an issue with my stack trace included. Would love to get this game running on my machine :) Major opportunities for some AoE games w/ friends during this COVID isolation.

heinezen commented 4 years ago

That would be great! :) Macs are always hard to test on because I don't own one .

(Also the issue I wanted to refer to was #1207. Me and numbers don't go well today.)

ericdude4 commented 4 years ago

Closing in favour of https://github.com/SFTtech/openage/issues/1091 for the concurrency related issue, and https://github.com/SFTtech/openage/issues/1207 for the Mac execution error

simonsan commented 4 years ago

@ericdude4 By any chance, were you using Python 3.8.2 to compile and run openage?

ericdude4 commented 4 years ago

@ericdude4 By any chance, were you using Python 3.8.2 to compile and run openage?

@simonsan I was using Python 3.8.0

simonsan commented 4 years ago

I could think of that this issue is more related to Python 3.8.0 due to something like on macOS, the spawn start method is now used by default in multiprocessing

https://www.python.org/downloads/release/python-382/

@ericdude4 Could you try using Python 3.7.x and see if it doesn't work as well just to be sure?

TheJJ commented 4 years ago

Weird.. If this is indeed the case, then I don't think it's a bug in our code. Just changing the Python interpreter's start method shouldn't change the behavior. https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods Let's see if we can figure this out. @ericdude4 we count on your help :D

ericdude4 commented 4 years ago

ericdude4 to the rescue! Getting the following error on make on current master branch:

➜  openage git:(master) ✗ make
[  0%] openage.codegen: generating c++ code
[  0%] Automatic MOC for target libopenage
[  1%] Building CXX object libopenage/CMakeFiles/libopenage.dir/versions/versions.cpp.o
/Users/ericfroese/Downloads/openage/libopenage/versions/versions.cpp:3:10: fatal error: 'gnu/libc-version.h' file not found
#include <gnu/libc-version.h>
         ^~~~~~~~~~~~~~~~~~~~
1 error generated.
make[3]: *** [libopenage/CMakeFiles/libopenage.dir/versions/versions.cpp.o] Error 1
make[2]: *** [libopenage/CMakeFiles/libopenage.dir/all] Error 2
make[1]: *** [all] Error 2
make: *** [build] Error 2

Getting this error with python 3.7 and 3.8 Following these instructions as usual: https://github.com/SFTtech/openage/blob/master/doc/build_instructions/os_x_10.14_mojave.md

simonsan commented 4 years ago

@ericdude4 Can you build the commit before the last one, we're currently working on fixing that. :-)

ericdude4 commented 4 years ago

@simonsan It's still producing the same multithreading error with Python 3.7.0, however, when I run the ./configure script, I am getting this info at the end of the output (even though python --version outputs 3.7.0)

openage 0.4.0.125

   version string | v0.4.0-125-g6467e268
         compiler | Clang 8.0.1
       build type | Release
         cxxflags |  -stdlib=libc++  -Wall -Wextra -pedantic -Wno-gnu-statement-expression
 build type flags | -DNDEBUG -O3
        build dir | /Users/ericfroese/Downloads/openage/.bin/clang++-release-Oauto-sanitize-none
   install prefix | /usr/local
py install prefix | /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages

(notice py install prefix | /Library/Frameworks/Python.framework/Versions/3.8/...) Is there somewhere I need to clear a cache or something? Unfortunately not a Python guy

simonsan commented 4 years ago

You could try doing make mrproperer to be sure everything is clean and then run configure again. Let's see if py install prefix changes then

ericdude4 commented 4 years ago

Yep! It worked: py install prefix | /Users/ericfroese/.pyenv/versions/3.7.0/lib/python3.7/site-packages

Also, the conversion seems to have worked properly now after switching to python 3.7 🥳

INFO [py] asset conversion complete; asset version: 7

heinezen commented 1 year ago

Multiprocessing was replaced with a different implementation a while ago, so this bug shouldn't conme up anymore.