jordanbray / chess

A rust library to manage chess move generation
https://jordanbray.github.io/chess/
MIT License
234 stars 54 forks

LTO currently required for reasonable performance - Some functions need #[inline] #20

Closed s-arash closed 5 years ago

s-arash commented 5 years ago

Does marking the functions with the #[inline] attribute help with performance? From what I understand, the compiler does not inline functions from external crates unless they are marked with #[inline], and I feel like inlining many of the functions in the library can have a noticeable impact on performance. I'm thinking mostly of the functions in magic.rs, and Board and BitBoard functions.
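To make the question concrete, here is a minimal sketch of what marking small, hot methods with `#[inline]` looks like. The `Bits` type and its methods are hypothetical stand-ins written for illustration, not the crate's actual `BitBoard` definition. The attribute is a hint rather than a guarantee: it makes the function body available to downstream crates so the optimizer can inline it even without LTO.

```rust
use std::ops::BitAnd;

/// Hypothetical stand-in for a bitboard newtype (assumption for illustration;
/// this is not the `chess` crate's real type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Bits(pub u64);

impl Bits {
    /// Count the set bits. Tiny and called in hot loops, so it is a
    /// natural candidate for `#[inline]`.
    #[inline]
    pub fn popcnt(self) -> u32 {
        self.0.count_ones()
    }
}

impl BitAnd for Bits {
    type Output = Bits;

    /// Intersect two bit sets. Without `#[inline]` (or LTO), a call from
    /// another crate may compile to a real function call.
    #[inline]
    fn bitand(self, rhs: Bits) -> Bits {
        Bits(self.0 & rhs.0)
    }
}
```

A caller in another crate would write `(a & b).popcnt()` exactly as before; only the code generation changes, not the API.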

jordanbray commented 5 years ago

Are you asking if I should add #[inline] to those functions? BitBoard in particular must be inlined for performance reasons, so if the compiler doesn't inline those, it very much needs to be fixed.

s-arash commented 5 years ago

Yes. I think inlining those functions improves performance. But I don't have any empirical data to back that up.

jordanbray commented 5 years ago

I'll need to check. In theory, enabling "LTO" in the Cargo.toml should have the same effect. https://doc.rust-lang.org/cargo/reference/manifest.html#the-profile-sections
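For reference, a minimal fragment of the setting being discussed, assuming the release profile is the one being measured. It goes in the Cargo.toml of the application being built, since Cargo ignores profile settings declared in dependencies:

```toml
[profile.release]
lto = true   # "fat" LTO: optimize across every crate in the dependency graph
```

Cargo also accepts `lto = "thin"`, which usually links faster at some cost in optimization.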

s-arash commented 5 years ago

You seem to be right. I enabled LTO and it improved my eval function's performance by more than 3x. It may still be a good idea to add #[inline] to performance-critical functions.

jordanbray commented 5 years ago

Yeah. I should definitely go through and inline some of the functions in that case. That said, for your application, using LTO is probably best because it allows inlining across all crates, not just mine. Which brings me to my next point: I wonder which functions need to be inlined. I know the BitBoard stuff does, but I'm curious what else. Is this eval function open source anywhere? If not, would you mind:

s-arash commented 5 years ago

Unfortunately, my toy chess engine is not open source. But I'll do these steps and post the results here.

jordanbray commented 5 years ago

Thanks. I appreciate it.

s-arash commented 5 years ago

callgrind.out.lto.txt callgrind.out.nolto.txt

I'm not familiar with valgrind, so I just did what you asked. If you need me to invoke the tool any differently, just let me know.

jordanbray commented 5 years ago

Sorry for the delay. This is resolved in 3.1.1. Feel free to open the bug again if the performance is still not where it needs to be.