Closed. minrk closed this issue 1 year ago.
I agree that using just the basename is good enough.
I'm not sure about just clang, though. The compilers are not ABI compatible with the system compilers and we really don't want python extensions to be compiled on user machines without the anaconda compilers.
That should probably be brought up with the Python recipe, since Python 3.7 with the new compilers is registering 'gcc' as the compiler to use to build extensions. FWIW, I'll just say that building extensions with the system compilers (Xcode 10) seems to work fine with Python 3.7 built with the conda compilers, though I don't know of any extensions that would be expected to fail. Does anyone have an example of a combination of compilers where Python built with one and an extension built with another doesn't work?
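For anyone who wants to check what their own interpreter registered, here is a minimal sketch using the standard-library sysconfig module. The helper name recorded_compiler is made up for illustration, the values printed depend entirely on how that Python was built, and on Windows these variables are typically not recorded at all.

```python
import os
import shlex
import sysconfig


def recorded_compiler(var="CC"):
    """Return (recorded value, basename of its executable) for a build variable.

    Returns (None, None) when the interpreter did not record the variable
    as a non-empty string (typical on Windows).
    """
    value = sysconfig.get_config_var(var)
    if not isinstance(value, str):
        return None, None
    # The recorded value may carry flags, e.g. "gcc -pthread";
    # the first token is the executable itself.
    tokens = shlex.split(value)
    if not tokens:
        return None, None
    return value, os.path.basename(tokens[0])


if __name__ == "__main__":
    for var in ("CC", "CXX", "LDSHARED"):
        print(var, recorded_compiler(var))
```

If the first element is a bare name such as gcc, extension builds will pick up whichever gcc is first on PATH at that point, which is the behaviour being discussed here.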
Some quick thoughts...
Maybe the compilers should be runtime requirements.
Maybe MPICH should be converted into a conda-build 3 style compiler.
cc @sodre (who has experience with the latter in the context of Go)
Does anyone have an example of a combination of compilers where Python built with one and an extension built with another doesn't work?
ABI compatibility is the main reason we are not releasing packages built with the new compilers to the main label until all (or most of) the packages are built.
For example, if the system compiler were from the gcc 4.x series and got picked up, a python extension compiled with it and linked against a C++ library built with newer compilers would break. For C libraries it shouldn't matter much, but I'm not an expert on the subject.
Maybe the compilers should be runtime requirements. Maybe MPICH should be converted into a conda-build 3 style compiler.
Not for the mpich package itself, since compilers are only a dependency for building new libraries, not for running packages built with mpi. Adding an mpich-mpicc output for each compiler that depends on the compilers as a true conda-build 3 compiler sounds like a great idea, though.
@isuruf thanks for the note about c++. I was only testing c, where my experience has been that things are extremely insensitive to which compiler other libraries used. I don't have a good understanding of C++ ABI compatibility, but maybe this points to an interesting question - does the right choice actually depend on whether a given package is c++ vs c vs fortran?
For instance, the C library I have the most experience with is libzmq, which is written in C++, but only exports a C API, not a C++ API. What would that mean for compatibility?
I know there are backward-compatibility issues with glibc versions, but not forward-compatibility.
It feels like this issue is probably either addressed or obsolete. @isuruf, @minrk, how do you see the status of this issue? Is it still something that should be addressed?
Some questions about paths and executables with the new compilers, prompted by #24 since MPI includes references to the compiler in the package and calls out to the compilers at runtime.
tl;dr;
Now that we have conda-packaged compilers, there are two questions to answer for what $CC and friends should be:
1. absolute path, or just the base name?
2. the host-prefixed name x86_64-apple-darwin13.4.0-clang, or the 'public' name clang (same goes for x86_64-conda_cos6-linux-gnu-cc)?
I'm not sure what the answer is for these in default behavior, but I'll illustrate how it affects MPI packages below. For packages that don't preserve references to how they were compiled, I don't believe either of these makes a difference. My inclination is to use the abspath+no host combination.
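To make the two questions concrete, here is a small sketch that enumerates the four values $CC could take. The function name cc_candidates and the bin directory in the usage example are hypothetical; only the host string and the clang name come from the discussion above. Whether an unprefixed clang actually exists in the compiler package's bin directory is not something this sketch asserts.

```python
import posixpath


def cc_candidates(bin_dir, host, name):
    """Enumerate the four possible $CC values: (abspath or basename) x (host or no host).

    bin_dir is an assumed location for the compiler executables.
    """
    prefixed = f"{host}-{name}"
    return {
        ("abspath", "host"): posixpath.join(bin_dir, prefixed),
        ("abspath", "no host"): posixpath.join(bin_dir, name),
        ("basename", "host"): prefixed,
        ("basename", "no host"): name,
    }


if __name__ == "__main__":
    # "/path/to/env/bin" is a placeholder, not a real conda path.
    options = cc_candidates("/path/to/env/bin", "x86_64-apple-darwin13.4.0", "clang")
    for (path_kind, name_kind), value in options.items():
        print(f"{path_kind:8} + {name_kind:7}: {value}")
```

Whichever of these four strings is in $CC when mpi is built is the string that ends up baked into mpicc and friends.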
absolute paths or not?
MPI providers are in a semi-unique situation in that they provide compiler wrappers mpicc, mpifort, etc. For the most part, this involves recording a reference to the $CC, $CXX, $FC environment variables in the wrappers themselves. The result is that the values of $CC, etc. at build time for mpi are relevant to downstream packages at runtime.
The first hiccup I ran into that prompted this was [this one], where builds succeeded but tests failed:
On linux, it failed immediately:
(build dir prefix truncated to $build, _placehol... stripped)
while on mac, C compilers succeeded, only fortran failed. This is because on mac:
bug in compiler packages
So the first issue, which I assume is a bug, is that the C compilers on mac use base name, while all other compilers use absolute paths. I assume these should be consistent, but I'm not sure which direction is right. Going with the majority would suggest that they should all use absolute paths. However, if all of the compilers used only their base name, like clang, the build would have succeeded without complaint.
This is failing for mpi packages because the compilers go in the build environment, so absolute paths don't get rewritten. I think the right thing to do for mpi in particular is actually to put compilers in the host environment. In my understanding, 'host' is the right place to put dependencies that the package may refer to, and mpi refers to the compilers that built it. It's not a runtime dependency, but it is often used in conjunction at runtime, so there should perhaps be appropriate restrictions when c compilers are requested in combination with mpi.
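The failure mode described above can be reproduced in miniature. The sketch below is a simplification under stated assumptions: the directory names build_env and user_env are invented, the "compilers" are empty executable stubs, and conda's prefix rewriting for host dependencies is not modelled at all. It only shows that a recorded absolute path into a discarded build environment no longer resolves, while a recorded base name is looked up on whatever search path is active later.

```python
import os
import shutil
import stat
import tempfile


def make_fake_compiler(bin_dir, name):
    """Create a tiny executable stub standing in for a compiler."""
    os.makedirs(bin_dir, exist_ok=True)
    path = os.path.join(bin_dir, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def resolve(recorded, search_path):
    """Resolve a recorded compiler reference.

    A reference with a directory component is checked as-is;
    a bare name is searched for on search_path.
    """
    return shutil.which(recorded, path=search_path)


def demo():
    with tempfile.TemporaryDirectory() as root:
        build_bin = os.path.join(root, "build_env", "bin")
        user_bin = os.path.join(root, "user_env", "bin")
        recorded_abs = make_fake_compiler(build_bin, "clang")
        make_fake_compiler(user_bin, "clang")

        # The build environment goes away once the package has been built.
        shutil.rmtree(os.path.join(root, "build_env"))

        return {
            "absolute": resolve(recorded_abs, user_bin),
            "basename": resolve("clang", user_bin),
            "expected": os.path.join(user_bin, "clang"),
        }


if __name__ == "__main__":
    print(demo())
```

The base name resolving is not automatically a good thing: it finds any clang on the path, including a system compiler, which is the ABI concern raised elsewhere in this thread.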
Absolute paths:
Conclusion: absolute paths seem like the right thing to use, and packages like mpi that include references to their compilers should signal this to conda by putting the compilers as host dependencies, not build dependencies.
compiler names
Looking at these names also prompted me to think about having versions in the executable names. This leads to the question: should $CC be darwin13.4.0-clang or just clang? I don't know the answer, because I don't actually know what's the source of darwin13.4.0 or how/when it can change. It appears to come from conda-build's BUILD env, but I don't really know if/when/how that would be changed. If a change to that value means that a package is definitely incompatible, then the current behavior seems right. If updating that build would not break compatibility, then clang is probably the right thing. In general, it seems like recording the 'public' name feels more correct, but I'd have to have a better understanding of what the host string really means in terms of compatibility. E.g. if libmpi is built with clang 4, but then a downstream library is built with clang 6, this is typically fine. The same goes for bumping base macos version - as long as downstream is always newer than upstream, it's typically safe.
conclusions
A relevant example: Python uses basename to remove the env prefix from compilers, since it, too, records its compilers in order to build extensions, ultimately only recording gcc for compilers, not the env or the host.
All that said, I think that just clang (base name, no host) might end up getting the best results.
cc @conda-forge/core