Open maedoc opened 6 years ago
I forgot to note: the Bernoulli model builds w/o problem.
@wds15 may be able to track this down, as he wrote the threading and sometimes uses Intel compilers.
Given their proprietary nature, not to mention what they do to arithmetic precision in the "fast" settings, we don't officially support them as part of the Stan project. We're happy to get contributions that don't break anything else that help support Intel compilers, but it's not a priority for us to fix.
By the way, this wasn't meant to discourage you from posting issues about Intel compilers. All else being equal, we're happy to support them. And we'd rather hear about issues than not. So thanks for submitting the issue.
> Given their proprietary nature, not to mention what they do to arithmetic precision in the "fast" settings
I haven't been able to reproduce numerical issues with the 2018 versions. In any case, I wouldn't be using them if GCC or Clang effectively vectorized Eigen code for AVX512, an Intel-only instruction set. However, this error only appears for threaded `map_rect` code, and the CPU in question is 68C/272T, so using GCC with a threaded version of the code (vs Intel single threaded) is a clear win in this case.
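For anyone comparing GCC and Intel builds, here is a minimal sketch that checks which compiler built a translation unit and whether AVX-512 is enabled for that build. It relies only on predefined compiler macros. The function names `compiler_name` and `has_avx512f` are hypothetical helpers made up for illustration and are not part of Stan or Eigen.

```cpp
#include <string>

// Hypothetical helper: report which compiler built this translation unit.
// The order of the checks matters: the classic Intel compiler and Clang
// both define __GNUC__ when running in GCC-compatible mode.
std::string compiler_name() {
#if defined(__INTEL_COMPILER)
  return "intel";
#elif defined(__clang__)
  return "clang";
#elif defined(__GNUC__)
  return "gcc";
#else
  return "unknown";
#endif
}

// Hypothetical helper: true when the build targets AVX-512 Foundation,
// e.g. when GCC or Clang is invoked with -mavx512f.
constexpr bool has_avx512f() {
#if defined(__AVX512F__)
  return true;
#else
  return false;
#endif
}
```

Printing both values from a small `main` in each build makes it easy to confirm that the flags passed through the makefile reached the compiler. It says nothing about whether the auto-vectorizer used AVX-512 for a particular Eigen expression.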
I can't promise to follow this up. Honestly, I had a lot of headaches with the Intel compilers, which seem not to care much about numerical accuracy and are not good at following the C++ standards either. This is why we moved to using gcc (right now version 6.1) with the Intel MKL. This way you get most of the Intel speed bump (MKL), but ensure that you have a compiler which is up to Stan's C++.
I ran into the same issue using StanHeaders 2.19.0 while compiling thurstonianIRT with Intel compilers.
Is there any hope at all to get this resolved, or is using StanHeaders with Intel compilers basically a lost cause?
It's not totally lost, but I am probably the only developer with access to Intel compilers... and I am really not blessed with free time...
Why not just use g++?
(we should still keep track of this, don't get me wrong)
Looks like a bug in ICC. We gave up on ICC since GCC did a good job and time was better spent getting the model to behave better than getting the auto vectorizer to work.
> Why not just use g++?
@wds15 We install R with a large collection of R libraries from CRAN with both GCC and Intel compilers, since the Intel compilers often produce better performing binaries.
So far, that hasn't been an issue with any of the R libraries we include (other than trivial compilation errors with Intel compilers that are easy to fix with a trivial patch).
@maedoc That sure looks very similar, but while I can reproduce the issue reported for the example given with Intel C++ compiler version 16.x, I cannot reproduce it with more recent Intel C++ compiler versions (17.0.1 works fine, so does the 19.0.1 I'm using now).
Some more info: it seems like this issue was reported to Intel a while ago already, also in the context of Stan, see https://software.intel.com/en-us/forums/intel-c-compiler/topic/781749 .
Summary:
Attempting to link a model w/ threading on Intel compiler produces a partial specialization error.
Description:
I am attempting to test scaling of threaded `map_rect` on a Xeon Phi system, with the following Stan model.

I confirmed locally that it uses multiple CPUs and tried to get it working on an HPC Xeon Phi node, first w/ GCC 5.5.0, which compiles fine ~but uses only one CPU. I assumed GCC doesn't want to thread for the Xeon Phi, so~ (`top` doesn't report CPU usage correctly on Xeon Phi). I tried Intel compilers (icpc version 18.0.2, gcc 5.5.0 compatible), but encounter a partial specialization error; full output below.
Reproducible Steps:
Current Output:
Expected Output:
Successful compilation
Additional Information:
I can't provide access to the system but can do some debugging if any ideas are provided.
Current Version:
develop (834df71b2c8f)