mellowagain / shiro

High performance, high quality osu!Bancho C++ re-implementation
GNU Affero General Public License v3.0

Improve performance for certain /web/ routes #126

Open mellowagain opened 5 years ago

mellowagain commented 5 years ago

I'm currently measuring the execution time of the major parts of Shiro, including routes and handlers. Most of the routes and handlers execute in less than 1 ms, which is already faster than all publicly available osu!Bancho re-implementations. This clearly meets our expectations and goals. Here are the logs from the performance test: shiro 2019-05-18 13:24:39.log
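For anyone who wants to reproduce per-handler timings like these, here is a minimal sketch of a scoped timer built on `std::chrono::steady_clock`. The names `TimingLog` and `ScopedTimer` are made up for illustration and are not taken from Shiro's code base; how Shiro actually collects its timings is not shown in this thread.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical sink (not part of Shiro): accumulated wall-clock seconds per label.
using TimingLog = std::map<std::string, double>;

// RAII timer: starts on construction and adds the elapsed time
// to the log entry for its label on destruction.
class ScopedTimer {
public:
    ScopedTimer(TimingLog &log, std::string label)
        : log_(log), label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    // Non-copyable so that each measurement is recorded exactly once.
    ScopedTimer(const ScopedTimer &) = delete;
    ScopedTimer &operator=(const ScopedTimer &) = delete;

    ~ScopedTimer() {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        log_[label_] += elapsed.count();
    }

private:
    TimingLog &log_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Wrapping a section of a handler in a block with `ScopedTimer timer(log, "Multiform parsing");` records that section's duration when the block ends. `steady_clock` is used rather than `system_clock` because it is monotonic and so unaffected by wall-clock adjustments.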

The following handlers, however, did not meet expectations and need to be optimized or redesigned to perform better:

The above two cases need to be optimized for a seamless player experience. Please discuss below how we could optimize these handlers.

Machine this performance test was run on

Hardware:

- **CPU**: AMD Ryzen 5 1600 (12) @ 3.200GHz
- **Memory**: 16'052MiB (~16 GB) DDR4 2800MHz
- **Graphics Card**: NVIDIA GeForce GTX 1060 6GB
- **Disk**: Shiro was run off an HDD

Software:

- **OS**: Arch Linux
- **Kernel**: Linux Zen 5.0.10-zen1-1-zen
- **Compiler**: Clang 8.0.0
- **Mode**: Debug (unoptimized)
- **MySQL**: 10.3.14-MariaDB

hazel0177 commented 5 years ago

This is usually caused by the osu!api being slow. gatari attempts to fetch every map possible and store it in its database, which is why it is very fast. I see that as the only possible way to improve speed.

mellowagain commented 5 years ago

Performance for score submission can be improved by implementing caching between score submissions for common values such as the beatmap difficulty (prevents re-calculation of beatmap difficulty).
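A rough sketch of what such a cache could look like, assuming difficulty depends only on the beatmap ID and the mod bitmask. `DifficultyCache`, `DifficultyKey` and the calculator callback are hypothetical names for illustration, not existing Shiro APIs.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical key: beatmap ID plus mod bitmask, since mods change the difficulty.
using DifficultyKey = std::pair<std::int32_t, std::uint32_t>;

class DifficultyCache {
public:
    // The expensive calculation is injected, so the cache does not depend on
    // any particular difficulty calculator.
    using Calculator = std::function<double(std::int32_t, std::uint32_t)>;

    explicit DifficultyCache(Calculator calculate) : calculate_(std::move(calculate)) {}

    // Returns the cached value, or calculates and stores it on a miss.
    // The lock is held during the calculation, which keeps the sketch simple
    // but serializes concurrent misses.
    double get(std::int32_t beatmap_id, std::uint32_t mods) {
        const DifficultyKey key{beatmap_id, mods};
        std::lock_guard<std::mutex> lock(mutex_);
        const auto it = cache_.find(key);
        if (it != cache_.end())
            return it->second;
        const double stars = calculate_(beatmap_id, mods);
        cache_.emplace(key, stars);
        return stars;
    }

    // Drops every entry for one beatmap, e.g. after the map has been updated.
    void invalidate(std::int32_t beatmap_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        // Keys are ordered by (ID, mods), so all entries for one ID are adjacent.
        auto it = cache_.lower_bound(DifficultyKey{beatmap_id, 0u});
        while (it != cache_.end() && it->first.first == beatmap_id)
            it = cache_.erase(it);
    }

private:
    Calculator calculate_;
    std::map<DifficultyKey, double> cache_;
    std::mutex mutex_;
};
```

The cache is unbounded here. A real server would likely want an eviction policy and a way to invalidate entries when a beatmap changes upstream.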

mellowagain commented 5 years ago

I've benchmarked score submission on my machine and got the following timings (in seconds, lowest to highest):

0.000000    Ranked/Passed checks
0.000000    User stats refreshing (memory)
0.000001    Stats refresh (memory)
0.000002    Replay exists check
0.000006    Flag checking
0.000012    Field checks
0.000035    Validity checks
0.000040    Table display building
0.000084    Score construction
0.000102    Score decryption to String
0.000664    Score struct construction
0.000822    Multiform parsing
0.000901    PP calculation
0.001040    Searching score in db (dup check)       #   1.040ms
0.001485    Replay saving                           #   1.485ms
0.002346    Stats refresh (db)                      #   2.346ms
0.002550    Score overwriting                       #   2.550ms
0.002807    Adding score to db                      #   2.807ms
0.003356    Init table display                      #   3.356ms
0.007840    #1 bot check in #announce               #   7.840ms
0.009854    User validity checks                    #   9.854ms
0.019503    Beatmap fetch + stats refresh (db)      #  19.503ms
0.240826    PP + Acc recalculation                  # 240.826ms
-----------------------------------------------------------------
0.294276    Total                                   # 294.276ms

Details:

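As we can see, most of the time is taken up by PP and accuracy recalculation (240.826 ms, about 82% of the total), followed at a distance by beatmap fetching and stats refresh from the db (19.503 ms, about 7%).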
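To make this kind of comparison repeatable across benchmark runs, a small helper can pick out the slowest stage and its share of the summed time. This is only an illustrative sketch: `Stage` and `slowest_share` are hypothetical names, and the stage list has to be filled in from your own measurements.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: stage name and its duration in seconds.
using Stage = std::pair<std::string, double>;

// Returns the slowest stage's name and its percentage of the summed time.
// Returns an empty name and 0.0 if no stages are given.
std::pair<std::string, double> slowest_share(const std::vector<Stage> &stages) {
    if (stages.empty())
        return {"", 0.0};

    double total = 0.0;
    for (const auto &stage : stages)
        total += stage.second;

    const auto slowest = std::max_element(
        stages.begin(), stages.end(),
        [](const Stage &a, const Stage &b) { return a.second < b.second; });

    const double share = total > 0.0 ? 100.0 * slowest->second / total : 0.0;
    return {slowest->first, share};
}
```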