GraylinKim / sc2reader

A python library that extracts data from various Starcraft II resources to power tools and services for the SC2 community. Who doesn't want to hack on the games they play?
http://sc2reader.readthedocs.org
MIT License

"2207 bytes left!" on newer replays. #190

Open MLLeKander opened 8 years ago

MLLeKander commented 8 years ago

I get an error when trying to parse recent replay files, such as this one (http://lotv.spawningtool.com/10851/).

The output of sc2parse using the lotv branch:

$ sc2parse IEM10-Taipei_Byun-sOs_G5.SC2Replay
dealing with IEM10-Taipei_Byun-sOs_G5.SC2Replay

IEM10-Taipei_Byun-sOs_G5.SC2Replay
Total failure parsing 3.1.1.39948
[ERROR] 2207 bytes left!
[ERROR] 2207 bytes left!
Traceback (most recent call last):
  File "/home/michael/Documents/sc2reader/sc2reader/scripts/sc2parse.py", line 78, in main
    replay = sc2reader.load_replay(path, debug=True, load_level=2)
  File "/home/michael/Documents/sc2reader/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/home/michael/Documents/sc2reader/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/home/michael/Documents/sc2reader/sc2reader/factories/sc2factory.py", line 146, in _load
    obj = cls(resource, filename=filename, factory=self, **options)
  File "/home/michael/Documents/sc2reader/sc2reader/resources.py", line 271, in __init__
    self._read_data(data_file, self._get_reader(data_file))
  File "/home/michael/Documents/sc2reader/sc2reader/resources.py", line 618, in _read_data
    self.raw_data[data_file] = reader(data, self)
  File "/home/michael/Documents/sc2reader/sc2reader/readers.py", line 128, in __call__
    raise ValueError("{0} bytes left!".format(data.length-data.tell()))
ValueError: 2207 bytes left!
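For context on the message: the last frame of the traceback raises when `data.length - data.tell()` is non-zero, that is, when bytes remain after the reader has finished. A minimal sketch of that kind of check follows. It uses a made-up two-field layout and `io.BytesIO`, not sc2reader's own decoder classes or replay format; `read_record` is a hypothetical name.

```python
import io
import struct


def read_record(payload):
    """Parse a hypothetical record: a 2-byte id followed by a 2-byte count.

    Illustration only; this is not sc2reader's replay format.
    """
    data = io.BytesIO(payload)
    record_id, count = struct.unpack("<HH", data.read(4))

    # Same idea as the check in the traceback: anything left unread means
    # the reader's notion of the layout does not match the data.
    remaining = len(payload) - data.tell()
    if remaining:
        raise ValueError("{0} bytes left!".format(remaining))
    return record_id, count


print(read_record(b"\x01\x00\x02\x00"))  # layout matches the reader

try:
    read_record(b"\x01\x00\x02\x00\xff\xff\xff")  # a longer, newer layout
except ValueError as exc:
    print(exc)
```

Read this way, the error suggests the file holds more data than this version of the reader knows how to consume, which would be consistent with a replay written by a newer game build.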

If it's relevant, this is the output from the master branch:

$ sc2parse IEM10-Taipei_Byun-sOs_G5.SC2Replay
dealing with IEM10-Taipei_Byun-sOs_G5.SC2Replay

IEM10-Taipei_Byun-sOs_G5.SC2Replay
Total failure parsing 3.1.1.39948
[ERROR] 'utf8' codec can't decode byte 0xfc in position 87: invalid start byte
[ERROR] 'utf8' codec can't decode byte 0xfc in position 87: invalid start byte
Traceback (most recent call last):
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/scripts/sc2parse.py", line 78, in main
    replay = sc2reader.load_replay(path, debug=True, load_level=2)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/factories/sc2factory.py", line 146, in _load
    obj = cls(resource, filename=filename, factory=self, **options)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/resources.py", line 271, in __init__
    self._read_data(data_file, self._get_reader(data_file))
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/resources.py", line 601, in _read_data
    self.raw_data[data_file] = reader(data, self)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/readers.py", line 33, in __call__
    ) for i in range(data.read_bits(5))],
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/decoders.py", line 252, in read_aligned_string
    return self._buffer.read_string(count, encoding)
  File "/home/michael/Documents/SC_RT/sc2reader/sc2reader/decoders.py", line 108, in read_string
    return self.read_bytes(count).decode(encoding)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 87: invalid start byte
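
As background on the message itself: `0xfc` can never begin a UTF-8 sequence, so the decoder rejects it wherever it appears. The snippet below reproduces the same class of error with made-up bytes (not taken from the replay). It does not show why sc2reader reached such a byte; one plausible reading, which is only an assumption here, is that the reader is misaligned on a newer file layout and is decoding non-text data as a string.

```python
# Made-up bytes for illustration: ASCII text followed by a stray 0xfc.
raw = b"abc\xfc"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # exc.start is the offset of the offending byte, matching the
    # "in position N" part of the message.
    print(exc.start, exc.reason)

# errors="replace" substitutes U+FFFD instead of raising, which is handy
# for inspecting a buffer while debugging.
print(raw.decode("utf-8", errors="replace"))
```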

dsjoerg commented 8 years ago

I think the lotv branch is behind the times unfortunately.

My branch https://github.com/dsjoerg/sc2reader/tree/upstream is more up-to-date.

MLLeKander commented 8 years ago

Thanks for the heads up. Your branch works!

stevemao commented 7 years ago

I got a similar error when trying your branch and the branches here:

sc2parse MyReplay.SC2Replay
dealing with MyReplay.SC2Replay

MyReplay.SC2Replay
Total failure parsing 3.16.0.55505
[ERROR] 'utf-8' codec can't decode byte 0xfc in position 74: invalid start byte
[ERROR] 'utf-8' codec can't decode byte 0xfc in position 74: invalid start byte
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/scripts/sc2parse.py", line 44, in main
    replay = sc2reader.load_replay(path, debug=True, load_level=1)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 146, in _load
    obj = cls(resource, filename=filename, factory=self, **options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/resources.py", line 271, in __init__
    self._read_data(data_file, self._get_reader(data_file))
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/resources.py", line 618, in _read_data
    self.raw_data[data_file] = reader(data, self)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/readers.py", line 36, in __call__
    ) for i in range(data.read_bits(5))],
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/readers.py", line 36, in <listcomp>
    ) for i in range(data.read_bits(5))],
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/decoders.py", line 252, in read_aligned_string
    return self._buffer.read_string(count, encoding)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/decoders.py", line 108, in read_string
    return self.read_bytes(count).decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 74: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/scripts/sc2parse.py", line 78, in main
    replay = sc2reader.load_replay(path, debug=True, load_level=2)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/factories/sc2factory.py", line 146, in _load
    obj = cls(resource, filename=filename, factory=self, **options)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/resources.py", line 271, in __init__
    self._read_data(data_file, self._get_reader(data_file))
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/resources.py", line 618, in _read_data
    self.raw_data[data_file] = reader(data, self)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/readers.py", line 36, in __call__
    ) for i in range(data.read_bits(5))],
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/readers.py", line 36, in <listcomp>
    ) for i in range(data.read_bits(5))],
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/decoders.py", line 252, in read_aligned_string
    return self._buffer.read_string(count, encoding)
  File "/usr/local/lib/python3.6/site-packages/sc2reader-0.7.0rc0-py3.6.egg/sc2reader/decoders.py", line 108, in read_string
    return self.read_bytes(count).decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 74: invalid start byte

dsjoerg commented 7 years ago

@stevemao if you post the replay somewhere I'll give it a try on my branch and see if there's something easy to fix. By the way, what kind of replay was this? (SC2 LotV? Ladder? 1v1?)

stevemao commented 7 years ago

Try this one: http://lotv.spawningtool.com/29785/

dsjoerg commented 7 years ago

Hi @stevemao, I just uploaded that replay to GGTracker, which indicates that my branch https://github.com/ggtracker/sc2reader/tree/upstream can parse the replay OK: http://ggtracker.com/matches/7166787

Oneoeigh commented 7 years ago

Hi @dsjoerg, I just tried your branch and got a similar error: "UnicodeDecodeError: 'utf8' codec can't decode byte 0xfc in position 27: invalid start byte". The platform is OS X 10.10.2 and the Python version is 2.7.6. I also tried uploading the replay to ggtracker.com and it worked well, so I guess there may be a compatibility problem.

BTW: "creeptracker.py" contains imports of the form "from Image import xxx", which fail on many computers with the error "No module named Image". I think this would be better for compatibility:

try:
    from Image import xxx
except ImportError:
    # No top-level Image module: use the one inside the PIL package.
    from PIL.Image import xxx
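
The same fallback can be written as a small helper, sketched below. `import_first` is a hypothetical name, not something in sc2reader. It catches only `ImportError`, so unrelated failures inside a module are not silently hidden the way a bare `except:` would hide them.

```python
import importlib


def import_first(*names):
    """Return the first module in `names` that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(
        "none of these modules could be imported: " + ", ".join(names)
    )


# For the case above this would be: Image = import_first("Image", "PIL.Image")
# Demonstrated with a made-up name and a standard-library module so that
# it runs on machines without any imaging library installed.
mod = import_first("no_such_module_for_demo", "json")
print(mod.__name__)
```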

dsjoerg commented 7 years ago

OK this is crazy/interesting. I committed a test to the suite (https://github.com/ggtracker/sc2reader/commit/a15a5f56d9e3aa5df2dfa211ab6afa77a4b5e4d7) that demonstrates that sc2reader can parse the replay. And yet when I run sc2parse on the replay I get the same failure that @stevemao shows. Tomorrow I'll look closer and figure out what's going on, unless someone beats me to it.

dsjoerg commented 7 years ago

OK @stevemao @MLLeKander everything is working.

In my environment, it turned out to be a PYTHONPATH problem that made it so that I was sometimes accidentally running the wrong version of sc2reader!
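
One way to check for this kind of mix-up is to ask Python where it would load the package from before running anything. This is a sketch assuming Python 3.4 or later; `module_origin` is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# "sc2reader" is the name of interest here; "json" is included only so the
# example prints something on machines without sc2reader installed.
for name in ("sc2reader", "json"):
    print(name, "->", module_origin(name))
```

If the printed path points at an old checkout or a stale egg in site-packages rather than the branch being tested, then PYTHONPATH or the install order is the likely culprit.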

Once I fixed things so that I was running the correct version, all replays parsed successfully. As I mentioned yesterday, I committed a test that demonstrates that sc2reader can parse the replay provided by @stevemao (http://lotv.spawningtool.com/29785/).

I also just tried the original replay provided by @MLLeKander (http://lotv.spawningtool.com/10851/) and it also parsed fine for me, and I uploaded it to GGTracker (http://ggtracker.com/matches/6447606) as well.

I'm leaving this issue open, because this issue is about GraylinKim/sc2reader, not ggtracker/sc2reader.

dsjoerg commented 7 years ago

And by the way everyone, please use https://github.com/ggtracker/sc2reader/tree/upstream and not https://github.com/dsjoerg/sc2reader/tree/upstream !!