GraylinKim / sc2reader

A Python library that extracts data from various StarCraft II resources to power tools and services for the SC2 community. Who doesn't want to hack on the games they play?
http://sc2reader.readthedocs.org
MIT License

Implement JSONEncoder. Fixes #117. #157

Closed: delta1 closed this 3 years ago

delta1 commented 11 years ago

Hi @GraylinKim

Could you please provide feedback on this pull request? Let me know if it's in line with your thinking and what still needs to be done.

Also let me know if anything is missing from the JSONEncoder, and what kind of testing is possible and required.

Thanks!

coveralls commented 11 years ago


Coverage decreased (-0.19%) when pulling fce79f2b39a7397ff7cc4d945d4564beb7d73178 on delta1:featureJSONencoder into 3416b04f75ae9bb78930b379b8f8f77090a55574 on GraylinKim:master.

GraylinKim commented 11 years ago

I just opened a pull request on your pull request to clean it up a bit. More importantly, there are still some issues with the JSONEncoder that I didn't fix. Try running the sc2json script over a few replays and you'll see what I mean.

Traceback (most recent call last):
  File "/home/graylinkim/projects/sc2reader/env/bin/sc2json", line 8, in <module>
    load_entry_point('sc2reader==0.6.3', 'console_scripts', 'sc2json')()
  File "/home/graylinkim/projects/sc2reader/sc2reader/scripts/sc2json.py", line 19, in main
    replay_json = factory.load_replay(args.path[0])
  File "/home/graylinkim/projects/sc2reader/sc2reader/factories/sc2factory.py", line 85, in load_replay
    return self.load(Replay, source, options, **new_options)
  File "/home/graylinkim/projects/sc2reader/sc2reader/factories/sc2factory.py", line 137, in load
    return self._load(cls, resource, filename=filename, options=options)
  File "/home/graylinkim/projects/sc2reader/sc2reader/factories/sc2factory.py", line 148, in _load
    obj = plugin(obj)
  File "/home/graylinkim/projects/sc2reader/sc2reader/factories/plugins/utils.py", line 19, in call
    return func(*args, **opt)
  File "/home/graylinkim/projects/sc2reader/sc2reader/factories/plugins/replay.py", line 16, in toJSON
    return json.dumps(replay, **options)
  File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 203, in encode
    chunks = list(chunks)
  File "/usr/lib/python2.7/json/encoder.py", line 437, in _iterencode
    for chunk in _iterencode(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 428, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 402, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 326, in _iterencode_list
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 437, in _iterencode
    for chunk in _iterencode(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 428, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib/python2.7/json/encoder.py", line 402, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 326, in _iterencode_list
    for chunk in chunks:
  File "/usr/lib/python2.7/json/encoder.py", line 436, in _iterencode
    o = _default(o)
  File "/home/graylinkim/projects/sc2reader/sc2reader/utils.py", line 336, in default
    return json.JSONEncoder.default(self, obj)
  File "/usr/lib/python2.7/json/encoder.py", line 178, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <sc2reader.events.message.PacketEvent object at 0x402d0d0> is not JSON serializable

Instead of using __dict__ for objects like the Observer, it might be better to enumerate the fields to be included, similar to how Player and Replay are handled. I'll also suggest that including team information and possibly event information would be a good thing. Including event information will create gigantic output documents but will make people happy.
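
For illustration, a minimal sketch of the field-enumeration idea (the FIELDS whitelist below is hypothetical; the real attribute names live on sc2reader's classes):

import json

class ReplayJSONEncoder(json.JSONEncoder):
    # Whitelist of attributes to serialize per type, instead of
    # dumping obj.__dict__ wholesale.
    FIELDS = {
        "Observer": ["name", "pid"],
        "Team": ["number", "result"],
    }

    def default(self, obj):
        fields = self.FIELDS.get(type(obj).__name__)
        if fields is not None:
            # Pull only the whitelisted attributes into a plain dict.
            return dict((f, getattr(obj, f, None)) for f in fields)
        # Unknown types fall through to the base class, which raises
        # TypeError just like the traceback above.
        return json.JSONEncoder.default(self, obj)

Usage would then be json.dumps(replay, cls=ReplayJSONEncoder).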

coveralls commented 11 years ago


Coverage increased (+0.08%) when pulling 17942d7e2789b700f115bfe4d91b2c23d468f2e3 on delta1:featureJSONencoder into 3416b04f75ae9bb78930b379b8f8f77090a55574 on GraylinKim:master.

GraylinKim commented 10 years ago

Hi @delta1, what's the status on this? Do you plan on finishing?

I'll finish the patch and merge it if you don't plan to do so soon.

delta1 commented 10 years ago

Hey @GraylinKim, sorry about the delay. I don't think I'll be able to do it soon, so please do go ahead when you have a chance. Thank you for the consideration.

headwinds commented 10 years ago

Just curious... is there any harm in committing this work as an experimental branch so that others could possibly pick up the work? I see there is only a master branch...

Hey @delta1, I'll take a look at what you've started and maybe I can help you finish it ;-D

headwinds commented 10 years ago

And just thinking out loud... I like what you said here:

"event information will create gigantic output documents but will make people happy."

Yes it will!

How gigantic do you think this JSON file would be? Over a MB? A few MBs? Surely not gigs?!?!

But if this file takes a long time to download, that might be an issue, especially for mobile... the GGTracker viz is so snappy - it blew my mind how you can scrub the data for a complete game!

It would be great for JS developers to have access to the full lot like that, but we should consider how best to deliver it in a timely fashion.

The event data could also be broken up into 2-5 minute intervals so that you could page through it, or we could investigate using Node.js streams to send the data to the client in chunks instead of waiting to download one large file.
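
Something like this rough sketch is what I have in mind (assuming each parsed event exposes a second attribute, as sc2reader events do; the two-minute bucket size is arbitrary):

from collections import defaultdict

def bucket_events(events, interval_seconds=120):
    # Group events into fixed-size time buckets so a client can page
    # through them. Assumes each event has a `second` attribute
    # (game time in seconds), which sc2reader events expose.
    buckets = defaultdict(list)
    for event in events:
        buckets[event.second // interval_seconds].append(event)
    # Return the buckets in chronological order.
    return [buckets[key] for key in sorted(buckets)]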

headwinds commented 10 years ago

@GraylinKim if you do decide to pick it up this weekend, I'll be around too to test and troubleshoot - I'll watch for updates

dsjoerg commented 10 years ago

Hi, GGTracker developer here, thanks :) In GGTracker, the server processes the replay and sends the client only what it needs in order to display the page. I bet the JSON would compress at least 5x, for what it's worth.
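
A quick stdlib check of that ratio might look like this (the file name is hypothetical):

import gzip
import os

# Compress a dumped replay JSON file and report the ratio.
raw = open("replay.json", "rb").read()
with gzip.open("replay.json.gz", "wb") as out:
    out.write(raw)
ratio = len(raw) / float(os.path.getsize("replay.json.gz"))
print("compressed %.1fx" % ratio)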

Now if you really want to blow people away, get one of those python-in-javascript thingies (http://www.rfk.id.au/blog/entry/pypy-js-first-steps/) and get sc2reader running in the browser. Then you could have a web page that parses a replay and shows you stuff about it, all with no server at all.


GraylinKim commented 10 years ago

Just curious... is there any harm in committing this work as an experimental branch so that others could possibly pick up the work? I see there is only a master branch...

It isn't my work, so I don't have a branch for it in my repo. You can work off the pull request's branch if you want to pick the work back up.

How gigantic do you think this JSON file would be? Over a MB? A few MBs? Surely not gigs?!?!

A full dump of information would need to include all the events, units, and unit types in the game. The size of the other information is pretty negligible. Opening up a replay from the ASUS ROG tournament this summer, I see:

If you were more restrictive about it and only included units, types, and events directly related to the players in the game, you would see:

So the restricted portion of the replay in JSON might weigh in at about 1.5 megabytes uncompressed for a two-player game. You could further restrict what gets serialized, but at some point you are just writing custom code that shouldn't be packaged.

Ignoring any data transfers, processing the raw replay data on the client side could also be an issue for mobile clients with weaker CPUs and/or JavaScript engines. GGTracker extracts and formats the information it needs to render pages and only sends that to the users. If you want to be "snappy", I think that is the way to go.

The event data could also be broken up into 2-5 minute intervals so that you could page through it, or we could investigate using Node.js streams to send the data to the client in chunks instead of waiting to download one large file.

Anything this customized would need to be 3rd party code. I don't think it makes sense to package code for breaking data into intervals for paging, or for streaming data in any way.

Now if you really want to blow people away, get one of those python-in-javascript thingies

Neat Python -> C -> JavaScript tool. The metrics at the bottom of that post indicate that parsing a replay could take over 10 minutes using the described technique. Might be worth waiting for an improved solution.

headwinds commented 10 years ago

The restricted option sounds like a decent starting point. If I had a 1.5 MB raw file to start with, I could look at ways of mining it to send only what the view requires, but sending 1.5 MB to a desktop isn't that bad either and is pretty easy to deal with. How far off is a solution like this?

@dsjoerg your concept sounds awesome and ideal - so when will you be bringing that to your GGTracker API? Pro account option? ;-D I just don't think many JS devs would pick up Python [yeah, yeah, we're missing out and lazy] to implement it; they would probably prefer either raw JSON files or a service that returns JSON.

headwinds commented 10 years ago

I've picked up this project again, attempting to use sc2reader as a module within a larger Python and Angular project built on Google App Engine and the sample PhotoHunt code base. I'm having some success working through the sample API, returning static strings and objects, but I got stuck when I actually needed to talk to the sc2reader module.

I'm having an import problem with mpyq. I went through the documentation and couldn't find anything about dealing with this apparent dependency. It's a hard dependency, right?

In resources.py, it attempts to import mpyq on line 11

So I grabbed the source from: https://github.com/eagleflo/mpyq

I copied the mpyq folder and placed it inside the sc2reader directory. Basically, my project structure looks like this:

myproject

I'm at a loss on how to get resources.py to pick up the mpyq folder. Am I missing some extra configuration somewhere?!

full trace Traceback (most recent call last): File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 239, in Handle handler = _config_handle.add_wsgi_middleware(self._LoadHandler()) File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 298, in _LoadHandler handler, path, err = LoadObject(self._handler) File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 84, in LoadObject obj = import(path[0]) File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/handlers.py", line 20, in import model File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/model.py", line 34, in from sc2reader.scripts import sc2json File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/sc2reader/init.py", line 11, in from sc2reader import factories, log_utils File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/sc2reader/factories/init.py", line 3, in from sc2reader.factories.sc2factory import SC2Factory File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/sc2reader/factories/sc2factory.py", line 26, in from sc2reader.resources import Resource, Replay, Map, GameSummary, Localization File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/sc2reader/resources.py", line 11, in import mpyq ImportError: No module named mpyq INFO 2014-05-11 19:37:20,181 module.py:627] default: "GET /api/themes HTTP/1.1" 500 - INFO 2014-05-11 19:37:20,996 module.py:627] default: "GET /favicon.ico HTTP/1.1" 304 - INFO 2014-05-11 19:37:21,171 module.py:627] default: "GET /favicon.ico HTTP/1.1" 304 - ERROR 2014-05-11 19:37:21,178 wsgi.py:262] Traceback (most recent call last): File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 239, in Handle handler = _config_handle.add_wsgi_middleware(self._LoadHandler()) File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 298, in _LoadHandler handler, path, err = LoadObject(self._handler) File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/runtime/wsgi.py", line 84, in LoadObject obj = import(path[0]) File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/handlers.py", line 20, in import model File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/model.py", line 34, in from sc2reader.scripts import sc2json File "/Users/bflowers/Projects/headwinds/sc2jsonhunt/sc2reader/init.py", line 10, in from sc2reader import engine ImportError: cannot import name engine