Closed mstembera closed 6 years ago
We had first replaced '-fprofile-generate' by '-fprofile-arcs' and '-profile-use' by 'fbranch-probabilities' for a 3 to 4% speed up. Later on you found out that with -fno-peel-loops -fno-tracer, the original flags '-fprofile-generate' and '-fprofile-use' can give similar or better results. So it has been committed and is current master. If we again replace it by '-fprofile-arcs' and 'fbranch-probabilities' respectively, gcc PGO builds hassle free but the speed up is lost to some extent. This might be helpful in correct diagnosis and the cure.
For compiling Stockfish with LTO and PGO need use -O2 instead of -O3. I just found it. In addition, even without LTO Stockfish is faster with -O2 on my PC (AMD FX-8150). https://postimg.org/image/kvz1a72jv/
I confirm speedup under Windows with gcc 7.3. I tested with a notebook with an Intel CPU under Ubuntu 16.04, it gave a 0.4% slowdown for O2 against O3. gcc is quite old there, it's gcc 5.4. I am thinking of opening a pull-request for changing to O2.
@CoffeeOne can you test SF -O2 LTO against original SF? In my small tests LTO with -O2 have not advantage.
I just found the offending optimizing option that gets switched on by -O3
, it's -fipa-cp-clone
(which can be switched off by specifying -fno-ipa-cp-clone
after -O3
). The PGO options aren't applied conforming to the g++ documentation either. I'll create a PR later today when I'm through experimenting.
The 'fix' proposed by @gvreuls doesn't solve the problem.
I (gcc 730 windows 10 64 pro) get
Step 2/4. Running benchmark for pgo-build ...
./stockfish bench > /dev/null
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_M_create
This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
make: *** [Makefile:445: profile-build] Error 3
My new test. Looks like -flto is good with PGO and -O2
PLAYER : RATING ERROR PLAYED (%) W D L D(%)
1 Stockfish_x64_050218_O2_popcnt_730gcc : 3206 4 6000 51.3 1332 3494 1174 58.2 2 Stockfish_x64_050218_O2_popcnt_730mingw : 3199 4 6000 49.9 1254 3474 1272 57.9 3 Stockfish_x64_050218_O3_popcnt_730mingw : 3194 4 6000 48.8 1204 3452 1344 57.5
Individual statistics:
1 Stockfish_x64_050218_O2_popcnt_730gcc : 3206 6000 (+1332,=3494,-1174), 51.3 %
Stockfish_x64_050218_O2_popcnt_730mingw : 3000 (+636,=1758,-606), 50.5 % Stockfish_x64_050218_O3_popcnt_730mingw : 3000 (+696,=1736,-568), 52.1 %
2 Stockfish_x64_050218_O2_popcnt_730mingw : 3199 6000 (+1254,=3474,-1272), 49.9 %
Stockfish_x64_050218_O2_popcnt_730gcc : 3000 (+606,=1758,-636), 49.5 % Stockfish_x64_050218_O3_popcnt_730mingw : 3000 (+648,=1716,-636), 50.2 %
3 Stockfish_x64_050218_O3_popcnt_730mingw : 3194 6000 (+1204,=3452,-1344), 48.8 %
Stockfish_x64_050218_O2_popcnt_730gcc : 3000 (+568,=1736,-696), 47.9 % Stockfish_x64_050218_O2_popcnt_730mingw : 3000 (+636,=1716,-648), 49.8 %
Please note that the proposed fix does work if you use the right options, see PR #1404
Okay, so the PR was closed because of all the clueless nonsense you posted @ChessMan3 thank you very much. So much for helping Windows users.
@snicolet
I must say, I do not understand, why you closed the pull-request, which solved this problem. For the prestige event TCEC only Windows performance counts. So somebody will create for TCEC a binary with enabling lto and -fno-ipa-cp-clone
. OK, it's an option, too.
@CoffeeOne I was so disappointed and angry I deleted the branch, so I don't think the PR can be reopened. When things calm down in a day or so and there still is interest in the patch I'll push it to github again and do a new PR. Having a working fix but being unable to make the people you're trying to help understand really got to me.
Yes, sleeping over it and starting afresh tomorrow is the wise thing to do, in my opinion.
I shall again suggest to start a thread in the forum for discussing the bug and converging to a consensus solution before opening a PR.
Somehow, MakeFiles are a little bit like regular expressions, everybody loves them and everybody has always very strong opinions about them, and like regular expression it can become extremely tricky to get them right, especially because our developing systems are surprisingly different. Marco and other senior developers have a lot of experience there, I am sure they will help.
Hoping this will resolves pacifically, Stéphane
I've started a thread on fishcooking. Let's hope this leads to something more positve than yesterday's try (also hoping we don't decide to influence/degrade other builds just to accomodate Windows users).
https://groups.google.com/forum/#!topic/fishcooking/SdRoxlbK5Wo
Thanks for opening that thread over there! The forum is indeed a better tool for this kind of problems, with less ping-pong messages between only two people and a slower-paced discussion -- especially if each forum contribution tries to stay constructive and summarize the discussion so far, which takes time :-)
Can you guys confirm that Ronald's solution fixes the problem in this issue? Ronald's pull request is there: https://github.com/official-stockfish/Stockfish/pull/1408
@snicolet I confirm this fixes the issue. @syzygy1 Thank you!
Closed because of the fix in https://github.com/official-stockfish/Stockfish/commit/860223c5e6b2df53af0415dbbbfdd4834f083708
make profile-build ARCH=x86-64-modern COMP=gcc fails under Windows with:
The issue started with this https://github.com/official-stockfish/Stockfish/commit/444d99b6d24ba823c8c1ed7a94f3c15b0a536024 "Rewrite benchmark" commit and has already been reported here https://groups.google.com/forum/#!topic/fishcooking/p2ypy3i0o7I https://groups.google.com/forum/#!msg/fishcooking/KaXqqnP21fI/YuvUjlUxBgAJ
It only shows up with the use of the -fprofile-generate AND -flto flags together. It is not obvious to me from looking at the code changes what may be responsible.