donadigo / TMTrackNN

Building TrackMania tracks with neural networks.
GNU General Public License v3.0

Error when preprocessing: list index out of range #3

Open Czechball opened 3 years ago

Czechball commented 3 years ago

Hello, I tried to process a big batch of replays (9706 replays). I started with python3 preprocessing.py -i replays/ -o out/ and it successfully processed 3649 replays, but then failed with the following error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "preprocessing.py", line 101, in process_fname
    replay_file = Gbx(fname)
  File "/home/czechball/Downloads/TMTrackNN/core/gbx.py", line 103, in __init__
    self._read_node(self.class_id, -1, bp)
  File "/home/czechball/Downloads/TMTrackNN/core/gbx.py", line 531, in _read_node
    self._read_node(_class_id, idx, bp)
  File "/home/czechball/Downloads/TMTrackNN/core/gbx.py", line 516, in _read_node
    self.read_ghost(game_class, bp)
  File "/home/czechball/Downloads/TMTrackNN/core/gbx.py", line 682, in read_ghost
    sample_sz = sample_sizes[i]
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "preprocessing.py", line 145, in <module>
    entries = p.starmap(process_fname, it)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 372, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
IndexError: list index out of range

It failed when processing this replay: 7027.Replay.Gbx.zip

I'm on Manjaro 5.8.18-1-MANJARO, my Python version is 3.8.6.

Czechball commented 3 years ago

UPDATE: On the next run (after excluding 7027.Replay.Gbx), it failed after 3647 replays on 1877.Replay.Gbx.zip. Maybe it fails after a certain threshold rather than on specific replays? In the previous run, when 7027 failed, 1877 was processed successfully.
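One way to tell a file-specific failure from a run-length effect is to parse each replay one at a time in a single process and record which ones raise. This is a minimal sketch, assuming the parser is passed in as a callable. `find_failing` is a hypothetical helper, not part of TMTrackNN, and the import path in the usage comment is only guessed from the traceback.

```python
def find_failing(paths, parse):
    """Call parse(path) on each path in turn; return [(path, error_text)] for failures."""
    failures = []
    for path in paths:
        try:
            parse(str(path))
        except Exception as exc:  # keep going so every file gets checked
            failures.append((str(path), f"{type(exc).__name__}: {exc}"))
    return failures


# Hypothetical usage (import path and directory name are assumptions):
# from pathlib import Path
# from core.gbx import Gbx
# for path, err in find_failing(sorted(Path("replays").iterdir()), Gbx):
#     print(path, err)
```

If the same files fail here regardless of order, the problem is likely in those files. If none fail, that points at the multiprocessing run instead.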

donadigo commented 3 years ago

Both replays parse successfully for me, so either something is going wrong in the processing step, or my local version of this repo has major changes that I just haven't gotten around to updating here. For now, I recommend removing the failing replays and, if that doesn't work, reducing the number of replays parsed. Sorry!
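As an alternative to removing failing replays by hand, the worker call can be wrapped so that one raising file is recorded instead of aborting the whole `starmap`. This is a rough sketch under stated assumptions, not the project's own code: `safe_call`, `split_outcomes` and `starmap_skip_failures` are hypothetical names, and the wrapped function has to be a module-level function so that it can be pickled.

```python
import multiprocessing as mp
import traceback


def safe_call(func, *args):
    """Return (result, None) if func(*args) succeeds, else (None, traceback_text)."""
    try:
        return func(*args), None
    except Exception:
        return None, traceback.format_exc()


def split_outcomes(arg_tuples, outcomes):
    """Separate successful results from (args, traceback_text) failure records."""
    results, failures = [], []
    for args, (result, error) in zip(arg_tuples, outcomes):
        if error is None:
            results.append(result)
        else:
            failures.append((args, error))
    return results, failures


def starmap_skip_failures(func, arg_tuples, processes=None):
    """Like Pool.starmap, but a raising call is recorded instead of aborting the run."""
    arg_tuples = list(arg_tuples)
    with mp.Pool(processes) as pool:
        outcomes = pool.starmap(safe_call, [(func, *args) for args in arg_tuples])
    return split_outcomes(arg_tuples, outcomes)
```

In `preprocessing.py` this would presumably replace the `p.starmap(process_fname, it)` call from the traceback with something like `entries, failed = starmap_skip_failures(process_fname, it)`. It does not fix the underlying `IndexError` in `read_ghost`; it only keeps the remaining replays from being lost.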

Czechball commented 3 years ago

Removing failing replays didn't work, but limiting the total number to 3500 did. It's a shame that I can't use a bigger dataset, but the results are pretty good anyway. Oh, and by the way, thank you for your work on this project. Some of the tracks I've generated so far using my dataset are looking very nice.