toni-chess closed this 3 months ago
Your testing methodology is not very accurate. I assume you got -15 with a huge +/- error margin, since you have only played 22 games against a pool of engines and only 1040 games in total. I would recommend heavily reducing the number of engines you play against and increasing the number of games played per engine instead. You could follow what CEGT does, which is still inaccurate but a bit better, since at least they only play against engines in the tested engine's rating range.
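To give a rough idea of how large the +/- gets at these sample sizes, here is a minimal sketch of the usual normal-approximation error margin on an Elo difference. The win/draw/loss splits are hypothetical (a roughly even result with about a third draws), not your actual results, and the function names are my own. It also treats all games as if they were played against one opponent, which a mixed pool is not, so take it as an order-of-magnitude estimate only.

```python
import math

def score_to_elo(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, lower, upper) using a normal approximation on the mean score.

    z=1.96 corresponds to a ~95% confidence interval.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    # Clamp to keep the Elo conversion finite for extreme scores.
    eps = 1e-9
    clamp = lambda s: min(max(s, eps), 1.0 - eps)
    return (score_to_elo(clamp(score)),
            score_to_elo(clamp(score - z * stderr)),
            score_to_elo(clamp(score + z * stderr)))

# Hypothetical W/D/L splits, chosen only to show how the margin shrinks with n.
for w, d, l in [(7, 8, 7), (330, 380, 330)]:
    elo, lo, hi = elo_with_margin(w, d, l)
    print(f"{w + d + l:5d} games: {elo:+.1f} Elo, 95% interval [{lo:+.1f}, {hi:+.1f}]")
```

With splits like these, the interval at 22 games is far wider than 15 Elo, and even at 1040 games it is about the same size as the difference being reported.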
Also, Clover 6.2 scales better at longer TC, and 09s+0.1s is very short, so a result at that TC may understate it.
TLDR: improve your testing methodology.
In my test, Clover 6.2 x64 (AVX2) is 15 Elo weaker than Clover 6.1 x64 (AVX2)!