Classical Evaluation Improved, but Search is no longer "Tuned" for it. [Regression on Classical-only Search]

BM123499 commented 3 years ago

First of all, I came here in good faith to talk about a whole section of Stockfish, Classical Evaluation, and I hope we can all have a clean discussion about it. NNUE has been introduced in Stockfish and has improved it a lot. That is undeniable, and I am not here to discuss how much it improved, but how our subsequent commits may have weakened SF_classical. SF_NNUE and SF_classical use different kinds of evaluation and run at different speeds. It's not difficult to see that our tunings and new logic in search.cpp are tuned for hybrid evaluation, so for other types of evaluation they may cause a regression. Of course, for most people, the hybrid evaluation is the way to go. But for those who, for any reason, can't (or decided not to) use NNUE, Stockfish 12 (or later) may cause a regression in their HCE analyses.

Thanks for reading this far, and I hope we can improve SF even further.

Reference to SF_classical

Tests, all using HCE Search:
- SF_13 vs SF_12
- SF_13 vs SF_classical
- SF13_dev vs SF_classical

Edited message

BM123499 commented 3 years ago

I also don't know everything. My point of view is based entirely on those tests. If anyone has a counter-argument (or many), feel free to share.

ddobbelaere commented 3 years ago

Thanks for bringing this up. The fact that local analysis on lichess uses no NNUE is an eye-opener for me; I was unaware of this. Let's hope someone finds the root cause of the slowdown, as the linked article seems to hint at an unknown issue. If SF NNUE were an unconditional gain on all platforms (including the web browser), then the situation would be more clear-cut IMO.

scchess commented 3 years ago

NNUE can't be used in browser-based analysis because it would require downloading a big network file. A website like lichess would be dragged down for practically no benefit.

vondele commented 3 years ago

I think they're definitely looking into having NNUE in the browser... if we have any WASM experts, that's a PR to study https://github.com/niklasf/stockfish.wasm/pull/30

vondele commented 3 years ago

Actually, it just moved here: https://github.com/hi-ogawa/Stockfish

hi-ogawa commented 3 years ago

Hi guys. I'm by no means a WASM or SIMD expert, but I experimented with a WASM SIMD backend for NNUE in that repository. Judging from my experiment, at this point, at least on x86, the performance of NNUE evaluation (essentially a 32x512 affine transform) is roughly comparable to a native SSE2 build. The reason it cannot reach SSSE3 is that WASM SIMD currently supports only an equivalent of _mm_madd_epi16, not of _mm_maddubs_epi16. The good news is that the WASM SIMD spec writers are still adding new instructions, and maybe they will add something like _mm_maddubs_epi16 (see for example https://github.com/WebAssembly/simd/pull/382). As for my implementation of the affine transform for WASM SIMD, you can find it here, and I'd appreciate any critique: https://github.com/hi-ogawa/Stockfish/blob/emscripten/src/emscripten/wasm_simd.cpp
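To make the difference between the two instructions concrete, here is a rough scalar sketch in Python of what `_mm_madd_epi16` and `_mm_maddubs_epi16` compute per 128-bit register, and of how a dot product of unsigned 8-bit inputs with signed 8-bit weights can still be formed when only the 16-bit multiply-add is available. The helper names, the 32-element row, and the restriction of the inputs to 0..127 are assumptions made for this illustration; this is not the code from the linked repository.

```python
# Scalar models of two x86 SIMD instructions, for illustration only.
# Lane counts follow the 128-bit versions; all helper names are hypothetical.

def _wrap_i32(v):
    """Wrap an integer to the signed 32-bit range, as a hardware lane would."""
    return (v + 2**31) % 2**32 - 2**31

def _saturate_i16(v):
    """Clamp an integer to the signed 16-bit range."""
    return max(-32768, min(32767, v))

def madd_epi16(a, b):
    """Model of _mm_madd_epi16: 8 signed 16-bit lanes -> 4 signed 32-bit lanes."""
    assert len(a) == len(b) == 8
    assert all(-32768 <= v <= 32767 for v in list(a) + list(b))
    return [_wrap_i32(a[i] * b[i] + a[i + 1] * b[i + 1]) for i in range(0, 8, 2)]

def maddubs_epi16(a, b):
    """Model of _mm_maddubs_epi16: 16 unsigned 8-bit x 16 signed 8-bit lanes
    -> 8 signed 16-bit lanes, adding adjacent products with saturation."""
    assert len(a) == len(b) == 16
    assert all(0 <= v <= 255 for v in a) and all(-128 <= v <= 127 for v in b)
    return [_saturate_i16(a[i] * b[i] + a[i + 1] * b[i + 1]) for i in range(0, 16, 2)]

def dot_with_maddubs(x, w):
    """Dot product of u8 inputs and i8 weights, 16 elements per instruction."""
    total = 0
    for i in range(0, len(x), 16):
        total += sum(maddubs_epi16(x[i:i + 16], w[i:i + 16]))
    return total

def dot_with_madd(x, w):
    """Same dot product using only the 16-bit multiply-add. The 8-bit values
    are treated as already widened to 16 bits (implicit with Python ints),
    so each instruction covers only 8 elements instead of 16."""
    total = 0
    for i in range(0, len(x), 8):
        total += sum(madd_epi16(x[i:i + 8], w[i:i + 8]))
    return total

# One 32-element row. Inputs stay in 0..127 (an assumption of this example),
# which keeps every pairwise sum inside the 16-bit range, so no saturation.
x = [(i * 7) % 128 for i in range(32)]
w = [(i * 13) % 256 - 128 for i in range(32)]
reference = sum(xi * wi for xi, wi in zip(x, w))
```

Both `dot_with_maddubs(x, w)` and `dot_with_madd(x, w)` agree with `reference` here. The widened path simply needs twice as many multiply-add steps for the same row, which fits the observation that a backend limited to the 16-bit instruction lands near SSE2 speed.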

Anyway, my point is that if the Stockfish NNUE (I mean, hybrid) SSE2 build is stronger than Stockfish in classical mode, then the same should be more or less true for a browser/Node.js environment with WASM SIMD.

PS. By the way, this is my first time commenting here on the Stockfish repo, and I just wanna say a big thanks to this project and all the people contributing here. I've only recently started chess programming (or actually chess itself), but seeing the strongest engine free for all is so amazing and has inspired me a lot!

BM123499 commented 3 years ago

@hi-ogawa Thanks for being here and talking to us. I have no experience in WASM or any web work. But can you answer some questions for me, or maybe point me to someone who can? The questions are the following:

Is it viable to have local NNUE analysis on websites?
If true, how long should we expect this to happen?
Will local analysis be replaced by cloud analysis soon?
What do you expect for the future of local web analysis?
Is it a problem if the engine is slightly downgraded on websites?
Do websites care if their engine is slightly better than the previous one?

And of course, all of these concern Classical Evaluation and/or NNUE on websites.

hi-ogawa commented 3 years ago

@BM123499 Those are very interesting questions, and indeed I'm interested in how lichess developers would answer them. From my perspective, I essentially did my experiment in order to answer the first question:

Is it viable to have local NNUE analysis on websites?

and I think I found the answer, which is yes.

For the second question,

If true, how long should we expect this to happen?

Technically speaking, the latest Chrome and Firefox (with some option enabled, see here for the details: https://github.com/hi-ogawa/Stockfish/wiki) should be able to run the WASM SIMD port of Stockfish NNUE. For example, you can try UCI commands directly on my very primitive frontend here: https://stockfish-nnue-js.vercel.app (note that it will download a compressed 10MB net, so be a bit careful if you're on a weak connection). If you're asking specifically when lichess is going to make it available, then I'm not sure. I think we're still in the process of testing and integration.

I cannot really answer the last four questions because they are about what's good for a website and service. So maybe you can ask directly on the lichess Discord https://discord.gg/hy5jqSs or in a GitHub issue at https://github.com/ornicar/lila, since you're interested in the website's features.

BM123499 commented 3 years ago

It seems that Lichess is aware of the regression on Classical Evaluation, and that they will probably introduce SF_NNUE for local analysis soon. I'll leave this open for a while to see if anyone comes up with some ideas or criticisms.

BM123499 commented 3 years ago

In addition, I removed all NNUE-related commits from SF_13 in this test: https://tests.stockfishchess.org/tests/view/603307af7f517a561bc4a026

It could be a reference for someone. I just hope it improves.

BM123499 commented 3 years ago

Thanks to everyone involved in this issue. Lichess chose to use the best-suited Stockfish version (SF_classical) for its HCE analyses, and I believe other websites will change as well.

I'm closing this issue and I hope it remains a reference for the future.

Dantist commented 3 years ago

@BM123499 Some similar tests were performed back in Dec 2020 (https://groups.google.com/g/fishcooking/c/fvWkshbDgMc).

At that time, it turned out that:

- Master branch with "NNUE commits removed" is 24 Elo stronger than the pre-NNUE SF commit. (https://tests.stockfishchess.org/tests/view/5fe199dc3932f79192d3958b)
- Master branch "with classical patches removed":
  - Is not better than master in NNUE mode. So these patches probably don't cause a regression for NNUE. (https://tests.stockfishchess.org/tests/view/5fe0d8ae3932f79192d39545)
  - Is 4.7 Elo weaker than master in Classical mode. So these patches probably gain Elo for classical. (https://tests.stockfishchess.org/tests/view/5fe0f82c3932f79192d3955a)

So there is a codebase that is stronger than SF_classical in "Classic Mode", but maintaining multiple branches with appropriate testing is quite difficult, and probably impractical.
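For a sense of scale, the Elo differences quoted above can be converted into expected scores with the standard logistic Elo formula, $E = 1 / (1 + 10^{-d/400})$. The snippet below is only a small sketch of that conversion. The function names are made up for illustration, and it says nothing about the error bars or the statistical model used by the linked tests.

```python
import math

def expected_score(elo_diff):
    """Expected score (between 0 and 1) of the side that is elo_diff stronger,
    under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def elo_from_score(score):
    """Inverse mapping: Elo difference implied by an average score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# The two differences mentioned in the summary above.
for diff in (24.0, 4.7):
    print(f"{diff:+.1f} Elo -> expected score {expected_score(diff):.4f}")
```

Under this model, 24 Elo corresponds to an expected score a little over 53%, while 4.7 Elo is still under 51%, which is why differences of that size need many games to measure.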

official-stockfish/Stockfish, issue #3365