Hi, thanks for reporting. Can you please try the current development version (or binary snapshots from https://github.com/lichess-org/fishnet/actions/runs/7432495919) to see if 9f1a11097cb6ede7ea21e6975bc39e2b690467f9 fixes the issue?
Had the same issue on Windows. It seems to be working better with 9f1a110, but I'm seeing what looks like lower nodes per second: around 4-6 knps, down from 7-10 knps before the 2.8 changes (5800X CPU).
I have let 2.8.2-dev run overnight. Even though there were only a handful of fairy-stockfish jobs, I haven't seen any timeouts anymore. Thanks for the quick fix.
Thank you both.
As for nps: it is measured as the nodes of real positions (excluding the newly introduced chunk overlap) divided by the total time taken for the whole batch, so a ~20% drop is expected. The degree of parallelism also varies much more now, so there's more variance in this measurement. We could measure something smoother, like nodes per CPU time, but ultimately wall clock time is what's relevant for the user experience.
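To make the measurement concrete, here is a rough sketch of the calculation as described above. It is not fishnet's actual code: the names `PositionResult`, `is_overlap` and `batch_nps` are made up for illustration, and the node counts and timing are invented numbers chosen so that the overlap accounts for 20% of the work.

```python
from dataclasses import dataclass


@dataclass
class PositionResult:
    """Hypothetical per-position record; not a fishnet type."""
    nodes: int
    is_overlap: bool  # True if the position was only searched as chunk overlap


def batch_nps(results, wall_seconds):
    """Nodes of real positions divided by wall clock time for the whole batch."""
    real_nodes = sum(r.nodes for r in results if not r.is_overlap)
    return real_nodes / wall_seconds


# Invented example: four real positions plus one overlap position,
# each searched to 1,000,000 nodes, with the batch taking 500 s in total.
results = [PositionResult(1_000_000, False) for _ in range(4)]
results.append(PositionResult(1_000_000, True))

reported = batch_nps(results, 500.0)                    # overlap excluded
all_nodes = sum(r.nodes for r in results) / 500.0       # overlap included
print(reported, all_nodes)
```

With these made-up numbers the engine searches the same number of nodes per second in both cases, but the reported figure is 20% lower because the overlap nodes are not counted while the time spent on them is.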
Since the 2.8.1 update, I have been seeing occasional worker crashes. That did not happen on 2.7.1, which I had been running 24/7 on a server for weeks.
Examples:
2024-01-05 21:49:00 W: Fairy-Stockfish timed out in worker 2.
2024-01-05 21:49:02 W: Fairy-Stockfish timed out in worker 3.
2024-01-05 21:49:02 W: Fairy-Stockfish timed out in worker 0.
2024-01-05 21:49:26 W: Fairy-Stockfish timed out in worker 1.
arch: x86_64-unknown-linux-musl
The same error occurs with the new parameter --cpu-priority unchanged.