official-stockfish / fishtest

The Stockfish testing framework
https://tests.stockfishchess.org/tests

Dynamic duplicate workers detection. #2052

Closed vdbergh closed 3 weeks ago

vdbergh commented 4 weeks ago

This is a PR on top of #2050

vdbergh commented 3 weeks ago

Rebased on top of #2050

vondele commented 3 weeks ago

In addition to the successful run reported in https://github.com/official-stockfish/fishtest/pull/2050#issuecomment-2153135294, the current logs look good:

$ sudo journalctl -u fishtest@6543 --since "120 minutes ago"  | grep -v "Request_task: the server is currently" | grep -v "dead task: run: http" | grep -v "Failed_task: failur" | grep -v "Request_task: refresh queue"  | grep -v 'Validate_random_run: validated aggregated data in cache'
-- Logs begin at Mon 2023-11-27 14:51:35 UTC, end at Fri 2024-06-07 05:44:01 UTC. --
Jun 07 03:44:04 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 03:46:50 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 03:50:05 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 03:52:55 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 03:56:40 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 03:58:52 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:02:28 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:04:54 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:07:55 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:12:36 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:14:05 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:17:53 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:20:38 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 04:22:57 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:26:49 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:29:26 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 04:32:09 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:35:14 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:37:58 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:40:38 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:43:55 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:47:16 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:49:45 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 04:52:35 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 04:56:05 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...
Jun 07 05:00:17 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 05:01:28 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 116 active workers...
Jun 07 05:04:54 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 05:07:32 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 05:11:19 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 117 active workers...
Jun 07 05:13:24 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 116 active workers...
Jun 07 05:16:34 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 116 active workers...
Jun 07 05:19:57 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 115 active workers...
Jun 07 05:20:25 tests.stockfishchess.org pserve[22683]: Update_task: task 666297cb935933703a99c5b9/0 is not active
Jun 07 05:23:32 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 116 active workers...
Jun 07 05:25:52 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 116 active workers...
Jun 07 05:28:27 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 114 active workers...
Jun 07 05:32:00 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 115 active workers...
Jun 07 05:34:54 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 111 active workers...
Jun 07 05:37:37 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 114 active workers...
Jun 07 05:40:57 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 114 active workers...
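
For anyone who wants to condense output like the above into a few numbers, here is a minimal sketch. It only assumes the `Clean_wtt_map: N active workers...` line format visible in the log; the function names `worker_counts` and `summarize` are hypothetical and not part of fishtest.

```python
import re

# Matches the count in lines such as
# "... pserve[22683]: Clean_wtt_map: 118 active workers..."
# (format taken from the log excerpt; the helper names are hypothetical).
PATTERN = re.compile(r"Clean_wtt_map: (\d+) active workers")


def worker_counts(lines):
    """Return the active-worker counts found in the given log lines, in order."""
    counts = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


def summarize(counts):
    """Return (minimum, maximum, mean) of the counts, or None if there are none."""
    if not counts:
        return None
    return min(counts), max(counts), sum(counts) / len(counts)


sample = [
    "Jun 07 03:44:04 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 118 active workers...",
    "Jun 07 05:20:25 tests.stockfishchess.org pserve[22683]: Update_task: task 666297cb935933703a99c5b9/0 is not active",
    "Jun 07 05:34:54 tests.stockfishchess.org pserve[22683]: Clean_wtt_map: 111 active workers...",
]
print(summarize(worker_counts(sample)))
```

Lines that do not match the pattern, such as the `Update_task` message, are simply skipped, so the journal output can be piped in without the `grep -v` filters.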

Under normal load the API looks good. In particular, request_task is now cheap (average 0.014, maximum 0.071 over 257 calls):

$ sudo bash ana_access_log.sh
# logging from [07/Jun/2024:04:21:31 +0000] to [07/Jun/2024:05:44:31 +0000] ########
# duration (seconds)            : 4980
# calls in total                : 10000
# calls per second              : 2
# calls not reaching the backend: 30
# calls handled by the backend  : 9970
#                         route    calls      total    average    minimum    maximum
                         /tests      466    324.723      0.697      0.007      1.423
                       /actions       18     37.811      2.101      0.046     18.013
                   /tests/user/       41     26.797      0.654      0.328      3.595
                   /api/actions     1079     24.887      0.023      0.002      0.082
               /api/update_task     4375     22.547      0.005      0.001      0.414
                      /api/beat     2597     11.346      0.004      0.001      0.322
                /api/upload_pgn      246      9.451      0.038      0.005      2.647
                /tests/finished       62      9.352      0.151      0.046      0.241
                     /tests/run        9      5.775      0.642      0.001      2.272
                   /tests/view/      134      4.918      0.037      0.003      0.620
               /api/active_runs      132      4.179      0.032      0.018      0.109
              /api/request_task      257      3.510      0.014      0.003      0.071
               /tests/live_elo/       49      2.989      0.061      0.007      0.456
                /tests/machines        5      2.445      0.489      0.355      0.618
                  /api/get_elo/       82      2.260      0.028      0.009      0.319
                  /tests/stats/       21      1.162      0.055      0.017      0.137
              /api/request_spsa       45      1.161      0.026      0.007      0.182
                  /contributors        4      0.686      0.171      0.143      0.214
           /api/request_version      257      0.553      0.002      0.000      0.036
                         /user/        4      0.533      0.133      0.028      0.432
                      /api/pgn/       18      0.299      0.017      0.003      0.095
                       /api/nn/       39      0.218      0.006      0.003      0.010
                         /login       12      0.187      0.016      0.003      0.050
               /user_management        2      0.149      0.074      0.002      0.147
                           /nns        4      0.102      0.026      0.015      0.034
                      /workers/        1      0.054      0.054      0.054      0.054
                  /tests/modify        2      0.033      0.017      0.016      0.017
               /api/failed_task        1      0.023      0.023      0.023      0.023
                    /tests/stop        1      0.023      0.023      0.023      0.023
          /contributors/monthly        1      0.022      0.022      0.022      0.022
                 /tests/approve        1      0.022      0.022      0.022      0.022
                  /tests/tasks/        1      0.021      0.021      0.021      0.021
                  /tests/delete        1      0.016      0.016      0.016      0.016
                          /user        1      0.010      0.010      0.010      0.010
                        /signup        1      0.009      0.009      0.009      0.009
                              /        1      0.009      0.009      0.009      0.009
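
The columns in this table (calls, total, average, minimum, maximum per route, sorted by total time) can be reproduced with a short aggregation. The sketch below is not the `ana_access_log.sh` script, which is not shown here; it assumes the access log has already been parsed into hypothetical `(route, seconds)` pairs, and `route_stats` is an illustrative name.

```python
from collections import defaultdict


def route_stats(records):
    """Aggregate (route, seconds) pairs into per-route statistics.

    Returns a list of (route, calls, total, average, minimum, maximum)
    tuples sorted by total time, largest first, mirroring the table layout.
    The input format is an assumption; real access logs need parsing first.
    """
    durations = defaultdict(list)
    for route, seconds in records:
        durations[route].append(seconds)

    rows = []
    for route, values in durations.items():
        total = sum(values)
        rows.append(
            (route, len(values), total, total / len(values), min(values), max(values))
        )
    # Sort by the "total" column in descending order.
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


records = [("/api/beat", 0.25), ("/tests", 2.0), ("/api/beat", 0.75)]
for route, calls, total, average, minimum, maximum in route_stats(records):
    print(f"{route:>30} {calls:>8} {total:>10.3f} {average:>10.3f} {minimum:>10.3f} {maximum:>10.3f}")
```

Sorting by total rather than by average puts the routes that consume the most backend time at the top, which is why a frequently called but cheap route such as `/api/request_task` ends up far down the list.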

Server load is negligible.

LGTM.

ppigazzini commented 3 weeks ago

Thank you @vdbergh, great series of PRs!