official-stockfish / fishtest

The Stockfish testing framework
https://tests.stockfishchess.org/tests
281 stars 130 forks source link

Periodically validate Fishtest's internal data structures. #2070

Closed vdbergh closed 3 months ago

vdbergh commented 3 months ago

In this way we guarantee that the schemas in schemas.py are kept up-to-date and hence preserve their documentary value.

Also: switch to string keys for the worker_runs dict. This is what we do elsewhere and it makes the schema more readable:

worker_runs_schema = {
    uuid: {
        run_id: True,
        "last_run": run_id,
    }
}

Also: retire the separate variable "purge_count" and include it in the "active_run" dict.

Requires upgrade of vtjson for the set datatype used in "unfinished_runs_schema".

ppigazzini commented 3 months ago

I wonder if this type of check should be blocking and done once in CI. In this way we are suggesting that it's fine to run the server with some outdated schemas.

vdbergh commented 3 months ago

This is to catch for example a developer using run["_id"] instead of str(run["_id"]) as key when inserting something in a dictionary. It may take a long time before this is noticed. I have been burnt by this myself.

It cannot be done only at startup since many of the data structures start out empty and they are filled while the server is running.

vdbergh commented 3 months ago

In other words I see no other way to enforce that the data structures used in Fishtest correspond to the schemas in schemas.py. Sure, currently they do, but as the data structures in Fishtest evolve, people will "forget" to update the corresponding schemas so that they will become outdated.

These are the "non-db" schemas verified in this PR:

cache_schema = {
    run_id: {
        "run": runs_schema,
        "is_changed": bool,  # Indicates if the run has changed since last_sync_time.
        "last_sync_time": ufloat,  # Last sync time (reading from or writing to db). If never synced then creation time.
        "last_access_time": ufloat,  # Last time the cache entry was touched (via buffer() or get_run()).
    },
}

wtt_map_schema = {
    short_worker_name: (runs_schema, task_id),
}

connections_counter_schema = {
    ip_address: suint,
}

unfinished_runs_schema = {
    run_id,
}

active_runs_schema = {
    "purge_count?": suint,
    run_id: {
        "time": ufloat,
        "lock": threading.RLock,
    },
}

worker_runs_schema = {
    uuid: {
        run_id: True,
        "last_run": run_id,
    }
}
ppigazzini commented 3 months ago

PROD already running with the PR, thank you @vdbergh