hohav / peppi

Rust parser for Slippi SSBM replay files
MIT License

Use array to store payload sizes for speed #25

Closed NickCondron closed 8 months ago

NickCondron commented 1 year ago

The command byte is a u8 and each payload size is a u16, so we can store the size of every possible command in a small (512-byte) array, [u16; 256]. This saves us a HashMap lookup on every event. My testing shows a ~2.5% speedup.
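
To make the idea concrete, here is a minimal sketch of such a table. It is not the code from this PR: the `PayloadSizes` type, its methods, the use of 0 as a "not declared" marker, and the command bytes and sizes in `main` are all hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;
use std::mem::size_of;

/// Hypothetical table: index = command byte, value = payload size in bytes.
/// In this sketch 0 means "no size declared for this command".
struct PayloadSizes([u16; 256]);

impl PayloadSizes {
    /// Build the table once from (command, size) pairs.
    fn from_pairs(pairs: &[(u8, u16)]) -> Self {
        let mut table = [0u16; 256];
        for &(cmd, size) in pairs {
            table[usize::from(cmd)] = size;
        }
        PayloadSizes(table)
    }

    /// A u8 index can never exceed 255, so this is a plain array load
    /// with no hashing involved.
    fn get(&self, cmd: u8) -> Option<u16> {
        match self.0[usize::from(cmd)] {
            0 => None,
            size => Some(size),
        }
    }
}

fn main() {
    // Made-up command bytes and sizes, for illustration only.
    let pairs = [(0x10u8, 100u16), (0x11, 64), (0x12, 8)];

    let by_hash: HashMap<u8, u16> = pairs.iter().copied().collect();
    let by_array = PayloadSizes::from_pairs(&pairs);

    // Both structures answer every possible command byte identically.
    for cmd in 0..=u8::MAX {
        assert_eq!(by_array.get(cmd), by_hash.get(&cmd).copied());
    }

    // 256 entries * 2 bytes each.
    assert_eq!(size_of::<PayloadSizes>(), 512);
    println!("size for 0x11: {:?}", by_array.get(0x11));
}
```

Because the index type is u8, every possible key has a slot, so the lookup needs neither hashing nor a fallible bounds check.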

Before: [image: flamegraph-before]

After: [image: flamegraph-sizes]

NickCondron commented 1 year ago

Based on my benchmarking, this change results in a ~7% speedup when parsing into a game and a ~12% speedup when parsing with an event handler. Seems like a big win! I also went ahead and tested NonZeroU16: it did provide a tiny improvement over Option<u16>, of < 0.2% or so.
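
The thread does not show how NonZeroU16 was used. One plausible reading, assumed here, is that each table slot is an Option<NonZeroU16> instead of an Option<u16>. The sketch below (with a hypothetical `Slot` alias, `lookup` helper, and made-up sizes) shows why that can help: the zero bit pattern encodes None, so the table stays at 512 bytes.

```rust
use std::mem::size_of;
use std::num::NonZeroU16;

/// Hypothetical slot type: None = command not declared.
type Slot = Option<NonZeroU16>;

/// Plain array load, then unwrap the non-zero wrapper back into a u16.
fn lookup(table: &[Slot; 256], cmd: u8) -> Option<u16> {
    table[usize::from(cmd)].map(NonZeroU16::get)
}

fn main() {
    // Option<NonZeroU16> reuses the value 0 as its None representation,
    // so it is guaranteed to be the same size as a bare u16.
    assert_eq!(size_of::<Slot>(), size_of::<u16>());
    assert_eq!(size_of::<[Slot; 256]>(), 512);

    // Option<u16> needs a separate discriminant, so its table is larger
    // (typically 1024 bytes with current compilers).
    assert!(size_of::<[Option<u16>; 256]>() > 512);

    let mut table: [Slot; 256] = [None; 256];
    table[0x11] = NonZeroU16::new(64); // made-up size, for illustration
    table[0x12] = NonZeroU16::new(0); // new(0) returns None, slot stays empty

    assert_eq!(lookup(&table, 0x11), Some(64));
    assert_eq!(lookup(&table, 0x12), None);
    println!("slot size: {} bytes", size_of::<Slot>());
}
```

A smaller table touches fewer cache lines, which would be consistent with the small gain reported, though the thread does not break down where the < 0.2% comes from.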

hbox_llod_timeout_g8_into_game
  Instructions:           228878779 (-11.65034%)
  L1 Accesses:            339200619 (-9.667089%)
  L2 Accesses:               320200 (-1.431430%)
  RAM Accesses:             1089478 (+0.736934%)
  Estimated Cycles:       378933349 (-8.685836%)

hbox_llod_timeout_g8_event_handlers
  Instructions:           165737319 (-15.40507%)
  L1 Accesses:            237312073 (-13.32839%)
  L2 Accesses:                  291 (+19.75309%)
  RAM Accesses:                1062 (-1.392758%)
  Estimated Cycles:       237350698 (-13.32660%)

ics_ditto_into_game
  Instructions:           151026555 (-10.59003%)
  L1 Accesses:            228851062 (-8.592390%)
  L2 Accesses:               313904 (-0.058264%)
  RAM Accesses:              623199 (-0.003530%)
  Estimated Cycles:       252232547 (-7.859051%)

ics_ditto_event_handlers
  Instructions:           108070308 (-14.20154%)
  L1 Accesses:            156062729 (-12.17231%)
  L2 Accesses:                  180 (+24.13793%)
  RAM Accesses:                 715 (+0.562588%)
  Estimated Cycles:       156088654 (-12.17038%)

long_pause_into_game
  Instructions:           764217806 (-11.54634%)
  L1 Accesses:           1131889265 (-9.580791%)
  L2 Accesses:              1555764 (-0.670766%)
  RAM Accesses:             3643603 (-0.677830%)
  Estimated Cycles:      1267194190 (-8.706986%)

long_pause_event_handlers
  Instructions:           582404252 (-14.62366%)
  L1 Accesses:            836003678 (-12.60904%)
  L2 Accesses:                  305 (+16.85824%)
  RAM Accesses:                1092 (-0.727273%)
  Estimated Cycles:       836043423 (-12.60852%)

mango_zain_netplay_into_game
  Instructions:           129177700 (-11.84776%)
  L1 Accesses:            194393823 (-9.698410%)
  L2 Accesses:               327275 (-0.345301%)
  RAM Accesses:              538075 (-0.110086%)
  Estimated Cycles:       214862823 (-8.866528%)

mango_zain_netplay_event_handlers
  Instructions:            96613111 (-15.23302%)
  L1 Accesses:            138470208 (-13.16489%)
  L2 Accesses:                 1298 (+3.261734%)
  RAM Accesses:                2380 (+3.074924%)
  Estimated Cycles:       138559998 (-13.15602%)

old_ver_thegang_into_game
  Instructions:            98175222 (-10.71166%)
  L1 Accesses:            149994783 (-8.627745%)
  L2 Accesses:               307984 (-0.289435%)
  RAM Accesses:              544252 (-0.000735%)
  Estimated Cycles:       170583523 (-7.668546%)

old_ver_thegang_event_handlers
  Instructions:            68708535 (-14.63328%)
  L1 Accesses:             99191266 (-12.55465%)
  L2 Accesses:                  158 (+26.40000%)
  RAM Accesses:                 641 (+0.156250%)
  Estimated Cycles:        99214491 (-12.55193%)

short_game_tbh10_into_game
  Instructions:             1056699 (-10.81377%)
  L1 Accesses:              1639355 (-8.597501%)
  L2 Accesses:                 5197 (+0.600077%)
  RAM Accesses:                5814 (-0.496320%)
  Estimated Cycles:         1868830 (-7.661529%)

short_game_tbh10_event_handlers
  Instructions:              723570 (-15.02956%)
  L1 Accesses:              1048259 (-12.86684%)
  L2 Accesses:                  319 (+20.83333%)
  RAM Accesses:                1056 (-1.400560%)
  Estimated Cycles:         1086814 (-12.48491%)
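
The "Estimated Cycles" rows above are consistent with a weighted sum of the three access counts: L1 + 5 × L2 + 35 × RAM. The thread does not name the benchmarking tool, so treat the weights as an inference from the numbers; the `estimated_cycles` helper below is hypothetical and simply re-derives three of the reported values.

```rust
/// Assumed weighting, inferred from the figures in this thread:
/// 1 cycle per L1 access, 5 per L2 access, 35 per RAM access.
fn estimated_cycles(l1: u64, l2: u64, ram: u64) -> u64 {
    l1 + 5 * l2 + 35 * ram
}

fn main() {
    // hbox_llod_timeout_g8_into_game
    assert_eq!(estimated_cycles(339_200_619, 320_200, 1_089_478), 378_933_349);
    // ics_ditto_event_handlers
    assert_eq!(estimated_cycles(156_062_729, 180, 715), 156_088_654);
    // short_game_tbh10_event_handlers
    assert_eq!(estimated_cycles(1_048_259, 319, 1_056), 1_086_814);

    // In the event-handler runs the L2 count is tiny, so even a ~24% rise
    // (for example to 180 accesses) contributes only 5 * 180 = 900 cycles.
    println!("L2 share: {} cycles", 5 * 180);
}
```

Under this weighting, the large percentage swings in L2 accesses for the event-handler benchmarks barely move the totals, because the absolute counts are only a few hundred.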

hohav commented 8 months ago

Merged via 7ef0402b3a89a978fffb5563d3a0f43be4181ac6. Thanks!