VedalAI / neuro-amongus

Among Us Plugin for Neuro-sama
GNU General Public License v3.0

Speed up conversion to nn format #72

Closed owobred closed 1 year ago

owobred commented 1 year ago

This speeds up get_x for Game objects by avoiding recomputing the x values for each point in the window.
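For anyone skimming, the general idea can be sketched roughly like this. This is not the actual `get_x` code: `featurize`, `windows_naive` and `windows_cached` are hypothetical names, and the sketch assumes the per-frame values depend only on the frame itself and not on its position in the window.

```python
from typing import Callable, List, Sequence, TypeVar

F = TypeVar("F")  # a raw frame (hypothetical type)
X = TypeVar("X")  # the per-frame x value (hypothetical type)


def windows_naive(
    frames: Sequence[F], featurize: Callable[[F], X], window: int
) -> List[List[X]]:
    # Calls featurize for every frame in every window, so a frame that
    # appears in `window` different windows is converted `window` times.
    return [
        [featurize(f) for f in frames[i : i + window]]
        for i in range(len(frames) - window + 1)
    ]


def windows_cached(
    frames: Sequence[F], featurize: Callable[[F], X], window: int
) -> List[List[X]]:
    # Converts each frame exactly once, then builds the windows by
    # slicing the cached results.
    cached = [featurize(f) for f in frames]
    return [cached[i : i + window] for i in range(len(frames) - window + 1)]
```

Both functions return the same windows, but the cached one calls `featurize` once per frame instead of once per frame per window.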

Comparison

I've provided a small benchmark on a 2MB file with ~1700 frames, which should hopefully be relatively representative.

Existing version

(neuro-amongus-py3.11) bred@fluff:~/amongie/neuro-amongus$ python AI/test.py 
Parsing: recordings/1684614514.gymbag2
Updating game data
Saving: recordings/decoded/1684614514.pickle
Converting to neural network format...
Conversion took 14.9826s
Saving neural network format...

New version

(neuro-amongus-py3.11) bred@fluff:~/amongie/neuro-amongus$ python AI/test.py 
Parsing: recordings/1684614514.gymbag2
Updating game data
Saving: recordings/decoded/1684614514.pickle
Converting to neural network format...
Conversion took 3.3117s
Saving neural network format...

This version is about 4.5x faster than the original version (14.98s down to 3.31s, saving roughly 11.7 seconds).
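The figures can be re-derived from the two "Conversion took" lines in the logs above. A minimal sketch, where `speedup_summary` is just a hypothetical helper name:

```python
def speedup_summary(old_seconds: float, new_seconds: float) -> tuple[float, float]:
    # Returns (speedup factor, seconds saved) for two wall-clock timings.
    return old_seconds / new_seconds, old_seconds - new_seconds


# Timings copied from the "Conversion took ..." lines in the two runs.
factor, saved = speedup_summary(14.9826, 3.3117)
print(f"{factor:.2f}x faster, {saved:.2f}s saved")
```

This is a single run on one file, so treat the ratio as a rough indication rather than a careful benchmark.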

Mild sanity check

This version produces a file in recordings/data/1684614514.pickle with an md5 that is identical to the hash of the file produced by the original version.
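If anyone wants to repeat the check, something like the following should work. It is only a sketch: `md5_of_file` and `same_output` are hypothetical helpers (not part of the repo), and it assumes the old and new outputs have been saved to two separate paths.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    # Hash the file in chunks so large recordings don't need to fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(old_path: str, new_path: str) -> bool:
    # True when both files are byte-for-byte identical (up to md5 collisions).
    return md5_of_file(old_path) == md5_of_file(new_path)
```

A matching hash means the two pickles are byte-identical, which is a stronger condition than the decoded data merely being equal.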

owobred commented 1 year ago

I've also toyed with this in the pad_list function, and got ~1 second of speedup on top of the other speedups by avoiding re-evaluating the padding values.

These two lines can have the default calculated before looping, though I haven't yet checked that the result is identical to the original. https://github.com/VedalAI/neuro-amongus/blob/main/AI/data/proto_defaults.py#L55 https://github.com/VedalAI/neuro-amongus/blob/main/AI/data/proto_defaults.py#L66
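Roughly what "calculating the default before looping" means, as a sketch. This is not the real `pad_list` or the code at the linked lines: `pad_list_naive`, `pad_list_hoisted` and `make_default` are hypothetical stand-ins.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def pad_list_naive(
    items: Sequence[T], length: int, make_default: Callable[[], T]
) -> List[T]:
    padded = list(items)
    while len(padded) < length:
        # The default is rebuilt on every iteration.
        padded.append(make_default())
    return padded


def pad_list_hoisted(
    items: Sequence[T], length: int, make_default: Callable[[], T]
) -> List[T]:
    padded = list(items)
    missing = length - len(padded)
    if missing > 0:
        # The default is built once and reused for every padding slot.
        default = make_default()
        padded.extend([default] * missing)
    return padded
```

One caveat with the hoisted version: every padding slot refers to the same object, so it only matches the original behaviour if the default is immutable or is never mutated afterwards.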

owobred commented 1 year ago

I've added some speedups that affect a bit more of the codebase at https://github.com/owobred/neuro-amongus/pull/1, and wanted to get a quick thumbs up before adding them here. The output hash is the same for the changes I'm looking to add as for what's currently in this PR.

Willing to ditch these changes if they're too much of a hassle, but I couldn't find any obvious improvements beyond this point.💖

p.s. the times in the linked PR are slower than the ones in this PR, but only because I'm using a different device (laptop on battery with the CPU limited to 50%). If you want to convert times from that PR to times in this PR, you can roughly halve them.

edit: Alex had a look at these other changes. The major speedup breaks existing behaviour, and the other speedup is only ~0.4s, which is super low compared to the other time-consuming parts of this.