Open sshivaji opened 10 years ago
That's consistent with the speed I get here: 859587 nps vs. 1423863 nps. To me, this is acceptable, although a smaller gap would be preferable. If anyone finds a way to improve the performance, please let me know.
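For reference, here is a quick sketch of how those two figures translate into a relative slowdown. It assumes the lower number is the (P)NaCl build and the higher one is native, which the numbers suggest but the comment does not spell out; the function and variable names are just for illustration.

```python
def slowdown_percent(slow_nps: int, fast_nps: int) -> float:
    """Return how much slower the first build is, as a percentage of the faster one."""
    return (1.0 - slow_nps / fast_nps) * 100.0


# Assumption: the lower figure is the (P)NaCl build, the higher one is native.
pnacl_nps = 859587
native_nps = 1423863

gap = slowdown_percent(pnacl_nps, native_nps)
speedup = native_nps / pnacl_nps

print(f"(P)NaCl build is {gap:.1f}% slower; native is {speedup:.2f}x as fast")
```

On these numbers the gap sits at the upper end of the 30-40% range reported in this thread.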
Thanks for the project!
I find the 30-40% gap odd given the NaCl paper benchmarks at http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/34913.pdf. I was expecting the performance loss to be about 5-10%. Profiling might reveal the reasons.
There is also https://developer.chrome.com/native-client/faq#how-fast-does-code-run-in-portable-native-client which gives slightly higher overheads than 5-10%, together with the caveat: "[...] whereas very branch-heavy code often performs worse." It is possible that stockfish is hit worse than average in [p]nacl. I just added -O3 to the linker flags, which gives a very slight performance improvement and a significant size reduction.
Thanks for the -O3 change. I tried -O4 for both compile and link, with a very slight gain over -O3. Do you know if it is easy to perform a PGO build? In the Stockfish tree, make can leverage PGO via "make profile-build ARCH=xxxx".
I have tried searching for this but have come up empty-handed. Clang/LLVM supports PGO, but you'd need a way to get the profile information out of a (p)nacl binary. I currently don't see a way to do that. It might make sense to try to talk to the native client team about this. I am planning to do that after I am able to release a first version of my chrome app that uses pnacl stockfish.
I have this code in an open source app now. I'm still curious about the performance issues, so let me know if you are able to resolve them or get them addressed.
Compared to native stockfish, the nacl/pnacl version is at least 30-40% slower. The slowdown was present even after I converted the Makefile to work via nacl instead of pnacl. Why is there such a discrepancy? Hopefully, I am doing something wrong and need to specify another optimization flag.