Closed frisam closed 4 years ago
Hi, and thanks for the PR!
I've been working on something along similar lines in the streaming branch, but with a different approach: I parse each frame's data lazily, so the parsing penalty is paid only when that frame's data is actually accessed. I also gain some performance by not parsing the whole replay as UBJSON, instead finding the `raw` element by looking for a specific byte sequence (hacky, but it's hinted at in the spec and it's what the official JS parser does).
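To make the byte-sequence trick concrete, here is a minimal sketch. It assumes the `raw` element is introduced by the UBJSON prefix `{U\x03raw[$U#l` followed by a 4-byte big-endian length; that marker and the helper name `find_raw` are assumptions for illustration, not code from either branch.

```python
import struct
from typing import Tuple

# Assumed UBJSON prefix for the `raw` element: an object containing the key
# "raw" (length 3) whose value is a typed uint8 array with an int32 count.
RAW_MARKER = b"{U\x03raw[$U#l"


def find_raw(data: bytes) -> Tuple[int, int]:
    """Return (offset, length) of the raw event stream inside `data`."""
    pos = data.find(RAW_MARKER)
    if pos < 0:
        raise ValueError("raw element marker not found")
    length_at = pos + len(RAW_MARKER)
    if length_at + 4 > len(data):
        raise ValueError("file is truncated before the raw length field")
    # The count that follows the marker is a big-endian signed 32-bit integer.
    (length,) = struct.unpack_from(">i", data, length_at)
    return length_at + 4, length
```

With the offset and length in hand, the event stream can be sliced out directly (`data[offset:offset + length]`) without decoding the rest of the file as UBJSON.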
Your branch is still notably faster for getting just the metadata and start/end events, and could be sped up a bit more by using the same hack as above plus skipping directly to the End Game event. But many users will want to do things like accessing the last frame to determine who won, which the lazy-parsing approach handles very well.
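As a rough illustration of the lazy-parsing idea (not the actual code in the streaming branch), each frame can keep its raw bytes and decode them only on first access. The fixed-size record layout and the names `LazyFrame` and `split_frames` are hypothetical.

```python
import struct
from typing import Any, Callable, List

_UNSET = object()

# Hypothetical fixed-size frame record: a frame index and one float value.
FRAME_FORMAT = ">if"
FRAME_SIZE = struct.calcsize(FRAME_FORMAT)


class LazyFrame:
    """Holds the raw bytes of one frame and decodes them on first access."""

    def __init__(self, raw: bytes, decode: Callable[[bytes], Any]):
        self._raw = raw
        self._decode = decode
        self._parsed = _UNSET

    @property
    def data(self) -> Any:
        # Decode once, then reuse the cached result on later accesses.
        if self._parsed is _UNSET:
            self._parsed = self._decode(self._raw)
        return self._parsed


def split_frames(stream: bytes, decode: Callable[[bytes], Any]) -> List[LazyFrame]:
    """Slice the stream into frames without decoding any of them."""
    return [
        LazyFrame(stream[i:i + FRAME_SIZE], decode)
        for i in range(0, len(stream), FRAME_SIZE)
    ]
```

Reading only `frames[-1].data` then costs one decode call, which is why checking the last frame to see who won stays cheap under this design.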
And even with the best speedups we could hope for, it'll still be fairly slow to get the metadata for a large directory of replays. A more scalable approach would be to create some sort of index, perhaps with SQLite. I think that's the only way to make things fast enough for an acceptable UX on large replay collections.
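A minimal sketch of what such an index might look like, using Python's built-in `sqlite3` module. The schema, column names and helper functions are all hypothetical; the only idea taken from the discussion is caching per-replay metadata so that each file is parsed once.

```python
import sqlite3
from typing import List, Sequence

# Hypothetical schema: one row per replay file, keyed by path.
SCHEMA = """
CREATE TABLE IF NOT EXISTS replays (
    path       TEXT PRIMARY KEY,
    mtime      REAL NOT NULL,
    stage      TEXT,
    characters TEXT
)
"""


def open_index(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    return conn


def needs_reindex(conn: sqlite3.Connection, path: str, mtime: float) -> bool:
    """True if the file is new or has changed since it was last indexed."""
    row = conn.execute(
        "SELECT mtime FROM replays WHERE path = ?", (path,)
    ).fetchone()
    return row is None or row[0] != mtime


def upsert_replay(conn: sqlite3.Connection, path: str, mtime: float,
                  stage: str, characters: Sequence[str]) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO replays (path, mtime, stage, characters) "
        "VALUES (?, ?, ?, ?)",
        (path, mtime, stage, ",".join(characters)),
    )
    conn.commit()


def find_by_stage(conn: sqlite3.Connection, stage: str) -> List[str]:
    rows = conn.execute(
        "SELECT path FROM replays WHERE stage = ? ORDER BY path", (stage,)
    )
    return [row[0] for row in rows]
```

A directory scan would call `needs_reindex` with each file's modification time and parse only the files that are new or changed, so browsing a large collection becomes a plain SQL query.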
But please try out the streaming branch and let me know what you think. I'm inclined to go with that design for now, but I'm definitely still looking for feedback. I'll probably end up reworking things at least once more as use of py-slippi grows, in any case.
Hey, thanks for the great library.
This PR modifies `slippi.game.Game` to add the ability to parse only the metadata, game start and game end events. This speeds up parsing in situations where you want to retrieve information about the overall game (characters, ports, stage, etc.) but have no need to inspect the actual gameplay frames.
My personal use case is loading a ton of replays into a database so that I can later search them by matchup, stage, etc. Parsing all the gameplay frames adds significant processing time, which this PR allows the user to avoid if desired (see performance comparison below).
Changes
- Added `partial_parsing=False` to the constructor of `slippi.game.Game`
- Added `_parse_file_partial` to `slippi.game.Game`
- Added `test_game_partial_parse`, and a new helper method, `_game_partial`, to `test/replays.py`
Performance Comparison
A quick check from `ipython` `%timeit`:
Notes
All tests are passing.
Based on something Fizzi said on Discord, it may be possible to extract the game end event without iterating through the raw stream to reach it. If so, this would give some additional speedup, since we could stop parsing after the game start event. Possible future improvement.
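One way this might work, sketched under assumptions that should be checked against the spec: the raw stream opens with an Event Payloads event (command byte `0x35`) listing each command's payload size, and a completed replay's raw stream ends with the Game End event (command byte `0x39`). The function names below are hypothetical and not part of this PR.

```python
import struct
from typing import Dict

# Command bytes assumed from the replay format spec.
EVENT_PAYLOADS = 0x35
GAME_END = 0x39


def payload_sizes(raw: bytes) -> Dict[int, int]:
    """Read the command -> payload size table at the start of the stream."""
    if len(raw) < 2 or raw[0] != EVENT_PAYLOADS:
        raise ValueError("raw stream does not start with Event Payloads")
    # Assumed layout: one size byte (which counts itself), then 3-byte
    # entries of (command: uint8, payload size: uint16 big-endian).
    table_size = raw[1]
    sizes = {}
    for offset in range(2, 1 + table_size, 3):
        command, size = struct.unpack_from(">BH", raw, offset)
        sizes[command] = size
    return sizes


def game_end_payload(raw: bytes) -> bytes:
    """Return the Game End payload by reading it from the end of the stream."""
    sizes = payload_sizes(raw)
    if GAME_END not in sizes:
        raise ValueError("no Game End entry in the payload size table")
    start = len(raw) - sizes[GAME_END] - 1
    # Weak sanity check: the byte where the event should begin must be the
    # Game End command; an incomplete replay would normally fail here.
    if start < 0 or raw[start] != GAME_END:
        raise ValueError("raw stream does not end with a Game End event")
    return raw[start + 1:]
```

If those assumptions hold, a partial parse could read the payload table and the game start event from the front of the stream, then jump straight to the tail for the game end event.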
Thanks & let me know of any input.