conda-forge / conda-forge.github.io

The conda-forge website.
https://conda-forge.org

Migration to conda-build 3 and new compilers from Anaconda #460

Closed: msarahan closed this issue 5 years ago

msarahan commented 7 years ago

Mega-thread for discussion and issue tracking with migration.

Currently known things to address:

Added 10/18/2017:

Lots more I'm probably missing here, so please comment and/or modify this issue.

ocefpaf commented 7 years ago

1 and 3 are kind of related and I would love to hear @isuruf's thoughts on those.

2 is, IMO, a no-brainer. We kill the current scheme, serve a canonical conda_build_config.yaml file somewhere (maybe the conda-build setup will be useful for this) and, in some rare cases, use a local file to override the default one.
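
A minimal sketch of what such a canonical file could look like (the package names and pins here are illustrative, not the actual conda-forge values):

zlib:
  - 1.2.11
jpeg:
  - 9b
pin_run_as_build:
  zlib:
    max_pin: x.x

A feedstock that genuinely needs something different would then carry a small local conda_build_config.yaml that overrides just those keys.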

It kind of solves the bot issue we have at the moment but we would need to trigger rebuilds when conda_build_config.yaml is updated. Or just let things "break" and re-build as needed (not so different from what happens right now with the dead bot).

A long term solution would be a service that rebuilds stuff following the dependency tree. One of my ideas for the Google Summer of Code 2018 BTW.

I don't have an opinion on 4 besides outsourcing as much as we can from conda-build-all to conda-build.

isuruf commented 7 years ago

how does toolchain package translate to compiler activation flags with new compilers?

I think we can remove toolchain. Do the new compilers have flags like -I${PREFIX}/include and -Wl,-rpath,${PREFIX}/lib ?

msarahan commented 7 years ago

I don't think those are defaults. I have needed to add them to some recipes. @mingwandroid, can you tell us why they are not defaults? Are they unsafe in some cases?

mingwandroid commented 7 years ago

I would rather not add -I${PREFIX}/include or -L${PREFIX}/lib because that un-generalizes the compilers.

mingwandroid commented 7 years ago

Same for -Wl,-rpath,... these things should be added when needed.

isuruf commented 7 years ago

I would rather not add -I${PREFIX}/include or -L${PREFIX}/lib because that un-generalizes the compilers.

I always think of $PREFIX as similar to /usr, where the system compiler gcc checks for headers and libraries. In that case it makes sense for the gcc compiler in a conda-build environment to check in $PREFIX. This doesn't have to happen in a normal conda environment, but can be done in a conda-build environment, just like we disable setuptools downloading in a conda-build environment. (I don't know anything about the new compilers, so I don't have any idea how this can be done.)

mingwandroid commented 7 years ago

Adding a check for $CONDA_BUILD would be 'ok' in my opinion; still, I do not like passing flags to builds that do not need them and serve no purpose.

msarahan commented 7 years ago

I would like to add a check for $CONDA_BUILD and add these flags when it is true. When builds fail because these flags are missing, it is an annoying hurdle for people to learn about. I will try to add this into conda-build today.

jakirkham commented 7 years ago

By adding into conda-build, do you mean setting environment variables for header and library search paths? If so, strong +1 from me. 😄

msarahan commented 7 years ago

In the activate.sh scripts provided by the compiler wrapper packages (gcc_linux-64 & clang_osx-64):

if [ "$CONDA_BUILD" == "1" ]; then
    CFLAGS="$CFLAGS -I$PREFIX/include"
    LDFLAGS="$LDFLAGS -L$PREFIX/lib -Wl,-rpath,$PREFIX/lib"
fi

CXXFLAGS would be set equivalently in activate scripts for the CXX packages.

(open to suggestions on reordering arguments)

msarahan commented 7 years ago

So, I guess to clarify, this isn't going to be set by conda-build, per se. Conda-build will set CONDA_BUILD, and then the compiler activate scripts will behave a little differently to add these flags.
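
For context, recipes opt into these wrapper packages through conda-build 3's compiler() jinja function rather than naming them directly; a minimal sketch (the concrete wrapper package, e.g. gcc_linux-64 or clang_osx-64, is resolved from the build configuration):

requirements:
  build:
    - {{ compiler('c') }}
    - {{ compiler('cxx') }}

Activating the resolved wrapper package is what sources the activate.sh scripts above.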

mingwandroid commented 7 years ago

A good example of where -I${PREFIX}/include and -L${PREFIX}/lib do cause real trouble is when you have a dependency on a package that depends on libiconv on linux. This package should not exist on linux, since glibc implements iconv and libiconv was extracted from it.

Adding those flags causes an undeclared, unnecessary and incorrect dependency to creep into your build. If the package that did depend on libiconv is later corrected so that it doesn't depend on libiconv, you end up with a broken package that links to iconv symbols it never should've linked to (and worse, has a DT_NEEDED entry of libiconv.so).

So I'm still reluctantly saying 'ok', but it's really wrong IMHO.

msarahan commented 7 years ago

Yeah, too bad there's no completely clear win here. I think the benefits of adding it outweigh the costs, but there also needs to be a way to disable it.

mingwandroid commented 7 years ago

Maybe we could delay such a change until conda-build/pyldd is equipped to warn (or better, error) on undeclared dependencies creeping into DT_NEEDED?

FWIW, to me it feels like we add those include and link FLAGS to far fewer than half of the recipes in AD5 (but I haven't checked).

msarahan commented 7 years ago

Half of the recipes in AD5 is a lot. If this trips up even 1/100 of the recipes in conda-forge, it's a big waste of people's time. I'd much rather fix the few recipes that this does affect negatively (these seem to be at the bottom of the stack), and have stuff further up the stack "just work".

isuruf commented 7 years ago

There are 198 recipes in conda-forge adding -L${PREFIX}/lib, which is about 5% of all recipes, or 25% of the recipes that compile anything.

ocefpaf commented 7 years ago

A good example of where -I${PREFIX}/include and -L${PREFIX}/lib do cause real trouble is when you have a dependency on a package that depends on libiconv on linux. This package should not exist on linux, since glibc implements iconv and libiconv was extracted from it.

@mingwandroid it does look like the libiconv case is a pathological one. (BTW I did not know that; we added libiconv everywhere to satisfy some old scientific software, and maybe we should revisit whether it is still needed.)

However, I am more on @isuruf's side. It does look natural for conda to add its PREFIX by default. With that said, I kind of got used to adding it manually because that is what we've been doing since day one. So I am +0.5.

isuruf commented 7 years ago

CXXFLAGS would be set equivalently in activate scripts for the CXX packages.

In toolchain, we are setting CPPFLAGS as well.

isuruf commented 7 years ago

Pinning is managed with conda_build_config.yaml files in conda-build 3. How to migrate conda-forge's existing pinning scheme to that?

Can conda-build 3 do what run_exports is doing using a config.yaml file? If so, it'll be easier to migrate. By that I mean: we pin in the build section and not in run, and the runtime pinning is derived from the build-time pinning. That way we can update pins in recipes. (A sketch of the run_exports mechanism follows at the end of this comment.)

The other option would be to remove pinning from recipes and have config.yaml do it. Some issues I have with that are:

  1. If there is a dependency that doesn't need to be pinned, what happens? E.g. we pin boost-cpp, but what if a package needs only the boost headers and not the libraries?
  2. When we update a pinning, right now we look at the individual recipes and update them, either manually or through a bot. If we switched to a config.yaml, we'll have no idea which recipes need to be rebuilt. (For Anaconda this is less of an issue because, from what I understand, you rebuild everything by looking at dependencies when a pinning change happens, but conda-forge is limited by CI.)
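
For reference, the run_exports mechanism looks roughly like this in an upstream library's own recipe (a sketch using conda-build 3's pin_subpackage helper; boost-cpp is just the example from above):

build:
  run_exports:
    - {{ pin_subpackage('boost-cpp', max_pin='x.x.x') }}

Downstream packages that list boost-cpp at build time then pick up the corresponding runtime constraint automatically, unless they opt out via ignore_run_exports (shown later in this thread).
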
notestaff commented 7 years ago

"-I${PREFIX}/include and -L${PREFIX}/lib does cause real trouble is when you have a dependency on a package that depends on libiconv on linux" -- could the lint checker catch such errors?

msarahan commented 7 years ago

For anyone who wants an example of run_exports and centralized pinning, I've been working on libgdal. Here's the recipe: https://github.com/AnacondaRecipes/libgdal-feedstock/blob/master/recipe/meta.yaml

Note that there's only one run requirement. The built package gets everything else from its build dependencies' run_exports. The output package metadata looks like:

{
  "arch": "x86_64",
  "build": "h28942fd_0",
  "build_number": 0,
  "depends": [
    "curl",
    "expat >=2.2.4,<3.0a0",
    "freexl >=1.0.4,<2.0a0",
    "geos >=3.6.0,<3.6.1.0a0",
    "giflib >=5.1.4,<5.2.0a0",
    "hdf4 >=4.2.13,<4.2.14.0a0",
    "hdf5 >=1.10.1,<1.10.2.0a0",
    "jpeg >=9b,<10a",
    "kealib >=1.4.7,<1.5.0a0",
    "libdap4 >=3.19.0,<3.20.0a0",
    "libkml >=1.3.0,<1.4.0a0",
    "libnetcdf >=4.4.1.1,<4.4.2.0a0",
    "libpng >=1.6.32,<1.7.0a0",
    "libpq >=9.6.5",
    "libspatialite >=4.3.0a,<4.4.0a0",
    "libtiff >=4.0.8,<5.0a0",
    "openjpeg >=2.2.0,<3.0a0",
    "poppler >=0.60.1,<1.0a0",
    "proj4 >=4.9.3,<4.9.4.0a0",
    "sqlite >=3.20.1,<4.0a0",
    "xerces-c >=3.2.0,<3.3.0a0",
    "xz >=5.2.3,<6.0a0",
    "zlib >=1.2.11,<1.3.0a0"
  ],
  "license": "MIT",
  "name": "libgdal",
  "platform": "osx",
  "subdir": "osx-64",
  "timestamp": 1508362884739,
  "version": "2.2.2"
}

The actual versions of those inputs come from https://github.com/AnacondaRecipes/aggregate/blob/master/conda_build_config.yaml

Note that there are 2 hdf5 versions in that conda_build_config.yaml file - building this libgdal recipe results in 2 output packages - one built for each hdf5 version.
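
The relevant entry has roughly this shape (a sketch; the linked conda_build_config.yaml is authoritative):

hdf5:
  - 1.8.18
  - 1.10.1

conda-build 3 expands each listed version into its own build variant, which is why conda render below emits two rendered recipes.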

isuruf commented 7 years ago

@msarahan, thanks. That's great. I have two questions.

  1. Is it possible to ignore run_exports? For example, a python package which loads a dynamic library using cffi doesn't need any pinning.

  2. Is it possible to get this output metadata without actually building the package? For the maintenance bot, what we could do is get the output metadata from the latest built package, compare it against the metadata the package would have if it were built right now, and trigger a rebuild if they differ.

msarahan commented 7 years ago

  1. Yes (from https://conda.io/docs/user-guide/tasks/build-packages/define-metadata.html#pin-downstream):

The potential downside of this feature is that it takes some control over constraints away from downstream users. If an upstream package has a problematic run_exports constraint, you can ignore it in your recipe by listing the upstream package name in the build/ignore_run_exports section:

build:
  ignore_run_exports:
    - libstdc++
  2. Yes, conda render takes all run_exports information into account. Here are the outputs from libgdal (note two rendered recipes, due to the 2 hdf5 variants):
$ conda render libgdal-feedstock
package:
    name: libgdal
    version: 2.2.2
source:
    patches:
        - disable_jpeg12.patch
        - windowshdf5.patch
        - multiprocessor.patch
    sha256: 14c1f78a60f429ad51c08d75cbf49771f1e6b20e7385c6e8379b40e8dfa39544
    url: http://download.osgeo.org/gdal/2.2.2/gdal-2.2.2.tar.gz
build:
    number: '0'
    run_exports:
        - libgdal >=2.2.2,<2.3.0a0
requirements:
    build:
        - glib 2.53.6 ha08cb78_1
        - ld64 274.2 h7c2db76_0
        - sqlite 3.20.1 h900c3b0_1
        - openssl 1.0.2l h6cc03a0_4
        - expat 2.2.4 hf840079_2
        - geos 3.6.0 hd02dd01_1
        - gettext 0.19.8.1 hb0f4f8b_2
        - icu 58.2 hea21ae5_0
        - jpeg 9b haccd157_1
        - hdf5 1.8.18 h1edb405_0
        - openjpeg 2.2.0 hede6aa5_1
        - zlib 1.2.11 h60db283_1
        - postgresql 9.6.5 hf30a1d1_0
        - libspatialite 4.3.0a h3ada6ff_17
        - pcre 8.41 h29eefc5_0
        - freexl 1.0.4 hc15b6b0_3
        - giflib 5.1.4 h6b45fa3_1
        - clang 4.0.1 h662ec87_0
        - xz 5.2.3 h808ee3b_1
        - pixman 0.34.0 h33cc98f_2
        - ca-certificates 2017.08.26 ha1e5d58_0
        - bzip2 1.0.6 h92991f9_1
        - proj4 4.9.3 h8d5df23_6
        - hdf4 4.2.13 h0cfac3f_1
        - curl 7.55.1 h0601bc4_3
        - freetype 2.8 h143eb01_0
        - libdap4 3.19.0 h57690e0_1
        - cctools 895 h7512d6f_0
        - libcxx 4.0.1 h579ed51_0
        - libpng 1.6.32 hce72d48_2
        - poppler 0.60.1 h25db2b2_0
        - libffi 3.2.1 hd939716_3
        - libcxxabi 4.0.1 hebd6815_0
        - cairo 1.14.10 h842a265_5
        - clang_osx-64 4.0.1 hc6fb1d2_0
        - compiler-rt 4.0.1 h5487866_0
        - libxml2 2.9.4 hbd0960b_5
        - libedit 3.1 hb4e282d_0
        - libnetcdf 4.4.1.1 h185d453_7
        - ncurses 6.0 ha932d30_1
        - xerces-c 3.2.0 h4094662_0
        - json-c 0.12.1 h7fa5a77_1
        - llvm 4.0.1 hc748206_0
        - kealib 1.4.7 h20aded0_4
        - libboost 1.65.1 h0af5023_1
        - libkml 1.3.0 he0c3ce7_2
        - libtiff 4.0.8 h8cd0352_9
        - poppler-data 0.4.8 hf2eda46_0
        - libgfortran 3.0.1 h93005f0_2
        - libssh2 1.8.0 h1218725_2
        - fontconfig 2.12.4 h30cff1d_2
        - libiconv 1.15 he69fb81_6
        - llvm-lto-tapi 4.0.1 h6701bc3_0
    host:
        - glib 2.53.6 ha08cb78_1
        - sqlite 3.20.1 h900c3b0_1
        - openssl 1.0.2l h6cc03a0_4
        - expat 2.2.4 hf840079_2
        - geos 3.6.0 hd02dd01_1
        - gettext 0.19.8.1 hb0f4f8b_2
        - icu 58.2 hea21ae5_0
        - jpeg 9b haccd157_1
        - hdf5 1.8.18 h1edb405_0
        - openjpeg 2.2.0 hede6aa5_1
        - zlib 1.2.11 h60db283_1
        - postgresql 9.6.5 hf30a1d1_0
        - libspatialite 4.3.0a h3ada6ff_17
        - pcre 8.41 h29eefc5_0
        - freexl 1.0.4 hc15b6b0_3
        - giflib 5.1.4 h6b45fa3_1
        - xz 5.2.3 h808ee3b_1
        - pixman 0.34.0 h33cc98f_2
        - ca-certificates 2017.08.26 ha1e5d58_0
        - bzip2 1.0.6 h92991f9_1
        - proj4 4.9.3 h8d5df23_6
        - hdf4 4.2.13 h0cfac3f_1
        - curl 7.55.1 h0601bc4_3
        - freetype 2.8 h143eb01_0
        - libdap4 3.19.0 h57690e0_1
        - libcxx 4.0.1 h579ed51_0
        - libpng 1.6.32 hce72d48_2
        - poppler 0.60.1 h25db2b2_0
        - libffi 3.2.1 hd939716_3
        - libcxxabi 4.0.1 hebd6815_0
        - cairo 1.14.10 h842a265_5
        - libxml2 2.9.4 hbd0960b_5
        - libedit 3.1 hb4e282d_0
        - libnetcdf 4.4.1.1 h185d453_7
        - ncurses 6.0 ha932d30_1
        - xerces-c 3.2.0 h4094662_0
        - json-c 0.12.1 h7fa5a77_1
        - kealib 1.4.7 h20aded0_4
        - libboost 1.65.1 h0af5023_1
        - libkml 1.3.0 he0c3ce7_2
        - libtiff 4.0.8 h8cd0352_9
        - poppler-data 0.4.8 hf2eda46_0
        - libgfortran 3.0.1 h93005f0_2
        - libssh2 1.8.0 h1218725_2
        - fontconfig 2.12.4 h30cff1d_2
        - libiconv 1.15 he69fb81_6
    run:
        - giflib >=5.1.4,<5.2.0a0
        - libtiff >=4.0.8,<5.0a0
        - kealib >=1.4.7,<1.5.0a0
        - jpeg >=9b,<10a
        - sqlite >=3.20.1,<4.0a0
        - hdf4 >=4.2.13,<4.2.14.0a0
        - libpng >=1.6.32,<1.7.0a0
        - libdap4 >=3.19.0,<3.20.0a0
        - proj4 >=4.9.3,<4.9.4.0a0
        - poppler >=0.60.1,<1.0a0
        - expat >=2.2.4,<3.0a0
        - curl
        - libpq  >=9.6.5
        - zlib >=1.2.11,<1.3.0a0
        - hdf5 >=1.8.18,<1.8.19.0a0
        - libspatialite >=4.3.0a,<4.4.0a0
        - geos >=3.6.0,<3.6.1.0a0
        - openjpeg >=2.2.0,<3.0a0
        - libnetcdf >=4.4.1.1,<4.4.2.0a0
        - freexl >=1.0.4,<2.0a0
        - xerces-c >=3.2.0,<3.3.0a0
        - libkml >=1.3.0,<1.4.0a0
        - xz >=5.2.3,<6.0a0
test:
    commands:
        - gdal_grid --version
        - gdal_rasterize --version
        - gdal_translate --version
        - gdaladdo --version
        - gdalenhance --version
        - gdalwarp --version
        - gdalinfo --formats
        - gdalinfo http://thredds.nersc.no/thredds/dodsC/greenpath/Model/topaz
        - conda inspect linkages -p $PREFIX libgdal
        - conda inspect objects -p $PREFIX libgdal
    files:
        - test_data
about:
    home: http://www.gdal.org/
    license: MIT
    license_file: LICENSE.TXT
    summary: The Geospatial Data Abstraction Library (GDAL).
extra:
    copy_test_source_files: true
    final: true
    recipe-maintainers:
        - ocefpaf
        - kmuehlbauer
        - gillins
        - msarahan

package:
    name: libgdal
    version: 2.2.2
source:
    patches:
        - disable_jpeg12.patch
        - windowshdf5.patch
        - multiprocessor.patch
    sha256: 14c1f78a60f429ad51c08d75cbf49771f1e6b20e7385c6e8379b40e8dfa39544
    url: http://download.osgeo.org/gdal/2.2.2/gdal-2.2.2.tar.gz
build:
    number: '0'
    run_exports:
        - libgdal >=2.2.2,<2.3.0a0
requirements:
    build:
        - glib 2.53.6 ha08cb78_1
        - ld64 274.2 h7c2db76_0
        - sqlite 3.20.1 h900c3b0_1
        - openssl 1.0.2l h6cc03a0_4
        - expat 2.2.4 hf840079_2
        - geos 3.6.0 hd02dd01_1
        - gettext 0.19.8.1 hb0f4f8b_2
        - icu 58.2 hea21ae5_0
        - jpeg 9b haccd157_1
        - openjpeg 2.2.0 hede6aa5_1
        - libspatialite 4.3.0a h3ada6ff_17
        - zlib 1.2.11 h60db283_1
        - postgresql 9.6.5 hf30a1d1_0
        - pcre 8.41 h29eefc5_0
        - freexl 1.0.4 hc15b6b0_3
        - giflib 5.1.4 h6b45fa3_1
        - clang 4.0.1 h662ec87_0
        - xz 5.2.3 h808ee3b_1
        - pixman 0.34.0 h33cc98f_2
        - ca-certificates 2017.08.26 ha1e5d58_0
        - bzip2 1.0.6 h92991f9_1
        - proj4 4.9.3 h8d5df23_6
        - hdf4 4.2.13 h0cfac3f_1
        - curl 7.55.1 h0601bc4_3
        - freetype 2.8 h143eb01_0
        - libnetcdf 4.4.1.1 h651529c_7
        - libdap4 3.19.0 h57690e0_1
        - cctools 895 h7512d6f_0
        - libcxx 4.0.1 h579ed51_0
        - libpng 1.6.32 hce72d48_2
        - poppler 0.60.1 h25db2b2_0
        - libffi 3.2.1 hd939716_3
        - libcxxabi 4.0.1 hebd6815_0
        - cairo 1.14.10 h842a265_5
        - clang_osx-64 4.0.1 hc6fb1d2_0
        - compiler-rt 4.0.1 h5487866_0
        - libxml2 2.9.4 hbd0960b_5
        - libedit 3.1 hb4e282d_0
        - kealib 1.4.7 h73e85dd_4
        - ncurses 6.0 ha932d30_1
        - xerces-c 3.2.0 h4094662_0
        - json-c 0.12.1 h7fa5a77_1
        - llvm 4.0.1 hc748206_0
        - hdf5 1.10.1 h6090a45_0
        - libboost 1.65.1 h0af5023_1
        - libkml 1.3.0 he0c3ce7_2
        - libtiff 4.0.8 h8cd0352_9
        - poppler-data 0.4.8 hf2eda46_0
        - libgfortran 3.0.1 h93005f0_2
        - libssh2 1.8.0 h1218725_2
        - fontconfig 2.12.4 h30cff1d_2
        - libiconv 1.15 he69fb81_6
        - llvm-lto-tapi 4.0.1 h6701bc3_0
    host:
        - glib 2.53.6 ha08cb78_1
        - sqlite 3.20.1 h900c3b0_1
        - openssl 1.0.2l h6cc03a0_4
        - expat 2.2.4 hf840079_2
        - geos 3.6.0 hd02dd01_1
        - gettext 0.19.8.1 hb0f4f8b_2
        - icu 58.2 hea21ae5_0
        - jpeg 9b haccd157_1
        - openjpeg 2.2.0 hede6aa5_1
        - libspatialite 4.3.0a h3ada6ff_17
        - zlib 1.2.11 h60db283_1
        - postgresql 9.6.5 hf30a1d1_0
        - pcre 8.41 h29eefc5_0
        - freexl 1.0.4 hc15b6b0_3
        - giflib 5.1.4 h6b45fa3_1
        - xz 5.2.3 h808ee3b_1
        - pixman 0.34.0 h33cc98f_2
        - ca-certificates 2017.08.26 ha1e5d58_0
        - bzip2 1.0.6 h92991f9_1
        - proj4 4.9.3 h8d5df23_6
        - hdf4 4.2.13 h0cfac3f_1
        - curl 7.55.1 h0601bc4_3
        - freetype 2.8 h143eb01_0
        - libnetcdf 4.4.1.1 h651529c_7
        - libdap4 3.19.0 h57690e0_1
        - libcxx 4.0.1 h579ed51_0
        - libpng 1.6.32 hce72d48_2
        - poppler 0.60.1 h25db2b2_0
        - libffi 3.2.1 hd939716_3
        - libcxxabi 4.0.1 hebd6815_0
        - cairo 1.14.10 h842a265_5
        - libxml2 2.9.4 hbd0960b_5
        - libedit 3.1 hb4e282d_0
        - kealib 1.4.7 h73e85dd_4
        - ncurses 6.0 ha932d30_1
        - xerces-c 3.2.0 h4094662_0
        - json-c 0.12.1 h7fa5a77_1
        - hdf5 1.10.1 h6090a45_0
        - libboost 1.65.1 h0af5023_1
        - libkml 1.3.0 he0c3ce7_2
        - libtiff 4.0.8 h8cd0352_9
        - poppler-data 0.4.8 hf2eda46_0
        - libgfortran 3.0.1 h93005f0_2
        - libssh2 1.8.0 h1218725_2
        - fontconfig 2.12.4 h30cff1d_2
        - libiconv 1.15 he69fb81_6
    run:
        - giflib >=5.1.4,<5.2.0a0
        - libtiff >=4.0.8,<5.0a0
        - kealib >=1.4.7,<1.5.0a0
        - jpeg >=9b,<10a
        - sqlite >=3.20.1,<4.0a0
        - hdf4 >=4.2.13,<4.2.14.0a0
        - libpng >=1.6.32,<1.7.0a0
        - libdap4 >=3.19.0,<3.20.0a0
        - proj4 >=4.9.3,<4.9.4.0a0
        - poppler >=0.60.1,<1.0a0
        - expat >=2.2.4,<3.0a0
        - hdf5 >=1.10.1,<1.10.2.0a0
        - curl
        - libpq  >=9.6.5
        - zlib >=1.2.11,<1.3.0a0
        - libspatialite >=4.3.0a,<4.4.0a0
        - geos >=3.6.0,<3.6.1.0a0
        - openjpeg >=2.2.0,<3.0a0
        - libnetcdf >=4.4.1.1,<4.4.2.0a0
        - freexl >=1.0.4,<2.0a0
        - xerces-c >=3.2.0,<3.3.0a0
        - libkml >=1.3.0,<1.4.0a0
        - xz >=5.2.3,<6.0a0
test:
    commands:
        - gdal_grid --version
        - gdal_rasterize --version
        - gdal_translate --version
        - gdaladdo --version
        - gdalenhance --version
        - gdalwarp --version
        - gdalinfo --formats
        - gdalinfo http://thredds.nersc.no/thredds/dodsC/greenpath/Model/topaz
        - conda inspect linkages -p $PREFIX libgdal
        - conda inspect objects -p $PREFIX libgdal
    files:
        - test_data
about:
    home: http://www.gdal.org/
    license: MIT
    license_file: LICENSE.TXT
    summary: The Geospatial Data Abstraction Library (GDAL).
extra:
    copy_test_source_files: true
    final: true
    recipe-maintainers:
        - ocefpaf
        - kmuehlbauer
        - gillins
        - msarahan

msarahan commented 7 years ago

As for build triggering based on stuff already being built, that's actually a weak spot that's on my short-term list. The current implementation is an exact build string match. That's way too sensitive and ends up rebuilding when it's not necessary. It should factor in at least 3 things:

jakirkham commented 6 years ago

I'd like to add, on the compiler front, that it would be good to sort out how we handle the existing LLVM-based stack and anything that already relies on it. IOW llvmdev, clangdev, libcxx, openmp, and the pending flangdev. Also, there is some discussion on adding LLVM's libunwind, and we already have the non-GNU libunwind packaged, which may need to be examined in this discussion.

cc @conda-forge/llvmdev @conda-forge/clangdev @conda-forge/libcxx @conda-forge/openmp

mingwandroid commented 6 years ago

Why is dev being added to all these package names? I cannot see any reason to do that. You wouldn't name gcc as gccdev for example. What's wrong with flang? Also why was openmp not identified with an llvm- prefix? Names are important and I'd like to know how I can throw my $0.02 in before these things get set in stone.

I have ended up having to create metapackages to point back to these names, e.g.:

https://github.com/AnacondaRecipes/boost-cpp-feedstock/blob/master/recipe/meta.yaml#L139-L159

With openmp, we might have wanted to use that name as a variant-selecting metapackage, but can't now.

msarahan commented 6 years ago

As Ray says, I recommend creating metapackages for these. They would point to compiler activation packages (e.g. clang_osx-64), which in turn point to "meat" compiler packages - the actual executables and all that.

The more we can agree on flags and centralize compiler packages (not duplicate packages on both conda-forge and defaults), the better (for the sake of cross-channel compatibility). I understand the arguments for keeping conda-forge self-contained, but I do feel like some compromise may be helpful here. We should discuss this in a call, though. Comments aren't really adequate.

What Ray is talking about regarding openmp is that there is more than one implementation. Intel also has one. Over time, it will be wise to ditch the existing meaning of the openmp package name, and use that name instead as a metapackage that can select different implementations.
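
A rough sketch of what such a selector metapackage could look like (entirely hypothetical; the name, version, and build-string convention would all need agreement):

package:
  name: openmp
  version: "1.0"         # hypothetical metapackage version
build:
  string: llvm           # one build per implementation
requirements:
  run:
    - llvm-openmp

A sibling build with string: intel, depending on the Intel implementation instead, would let users pick an OpenMP runtime by build string.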

SylvainCorlay commented 6 years ago

Names are important and I'd like to know how I can throw my $0.02 in before these things get set in stone.

The Debian community has very strict naming conventions for packages. For example, python packages are generally named python-pypi_package_name which prevents conflicts with ruby, perl and whatnot.

As conda becomes used more and more as a general-purpose package manager, instead of being focused on python, a conda-based distribution would benefit a lot from using such naming conventions.

msarahan commented 6 years ago

I doubt you'll find any dissent to that, but we also have to have a comprehensive plan for migrating from old names to new ones. The metapackages Ray and I have talked about are how we think it needs to be done. The other question is the timescale for declaring old names invalid. While perpetual maintenance may seem friendly to users, it is burdensome and also precludes using old names for new purposes like openmp here.


msarahan commented 6 years ago

I wrote up a CEP for the renaming issue at https://github.com/conda/conda/wiki/CEP-2:-Renaming-packages. Feedback welcome.

SylvainCorlay commented 6 years ago

Thanks, that would be quite awesome!

ChrisBarker-NOAA commented 6 years ago

All good, thanks folks!

But frankly, we've been really lax about using proper conventions for naming new packages too!

For instance, as a rule, new python packages in conda-forge simply get their PyPI name. Given conda's legacy, maybe we should keep doing that, but there should be a clear policy.

jakirkham commented 6 years ago

So given the big topics this issue is trying to tackle, I'd like us to avoid digressing into a general discussion about naming. Not that it isn't a big topic worthy of attention; there is just a long history behind it already, and it really deserves its own thread. Maybe issue ( https://github.com/conda-forge/conda-forge.github.io/issues/18 ) or a new one continuing off of that issue, which could be called "Naming revisited". To be clear, I'm happy to have that discussion continue in tandem, just not in this thread. Thanks in advance for understanding.

jakirkham commented 6 years ago

That said, the naming of the compiler packages mentioned is definitely relevant for this thread and should be discussed. Responding to the points raised by @mingwandroid on this front below. Sorry if I missed anything. Much to say/respond to on this front. Tried to respond per package. Then some musings on metapackages and more generally where we stand on the compiler front at the end.

As to your question about llvmdev, here is the original PR ( https://github.com/conda-forge/staged-recipes/pull/1482 ) that added it. It's basically a year old. We can all probably benefit from at least skimming it. The name switched several times. I believe the dev was added because it builds static libraries, and the version was left out of the name because the libraries are static. I think the same is true of clangdev, which was added in PR ( https://github.com/conda-forge/staged-recipes/pull/1481 ). Though @inducer will remember better than I.

We added openmp in PR ( https://github.com/conda-forge/staged-recipes/pull/3171 ). We debated openmp (Homebrew uses this) vs. llvm-openmp (didn't find anyone using this, but it must be out there) vs. libomp (Linux distros use this) in this thread. In the end we went with openmp as we saw no conflict with gcc as it would likely use libgomp. There wasn't much concern with an intel conflict at the time even though I raised it. Probably as it was unclear if Intel packages would exist in conda-forge (though this is speculation as to the lack of interest from others). There has been some interest in changing this both on the conda-forge and Intel ends, but that is a totally different discussion that should be separate ( https://github.com/conda-forge/staged-recipes/issues/3022 ) and probably needs a meeting as well. Digression aside we stuck with openmp as it matched the package name and was the most clear to people. Not attached to it though, but renaming would break a few things now (not many though).

flangdev is still in the works. Guessing the name also relates to static libraries. I have not been close to that work, though. My gut feeling is that much of it is still in flux, and the name could probably still be changed. There is not a staged-recipes PR for it yet. Would have to talk with @isuruf for details.

The metapackage idea is interesting, particularly with openmp, though it feels like features are still pretty broken. What is the status of the build string in conda-build 3? That said, tweaking the build string still feels like a hack (admittedly better than nothing). Is there any other work on this front in conda/conda-build that would improve package selection?

Though I guess the question I have right now is: are we only concerned about the names of these compilers? If so, that makes me feel a lot better about where we stand, but I suspect that would be false confidence. We probably need to take a look at how LLVM and friends are built in both channels before we can really assess what, if anything, needs to change. I hesitate to spin this off as a subthread/new issue yet, but maybe we should be thinking about that too.

mingwandroid commented 6 years ago

OK, let's save naming for another time, but I will say that adding dev just because there are static libs seems silly to me. Lots of things have static libs. The most important thing about names, apart from avoiding conflicts, is not violating the principle of least surprise.

Anyway ...

The clang recipe is available here: https://github.com/AnacondaRecipes/aggregate/tree/master/llvm-compilers-feedstock/recipe

It is huge and complicated. It generates separate packages for llvm-lto-tapi, cctools, ld64, clang, clangxx, llvm, polly, libcxx, libcxxabi, libunwind, llvm-openmp and compiler-rt.

The complexity was necessary because we need to build on macOS 10.9, but building clang 4.0.1 requires a more recent clang than macOS 10.9 ships with (we use an official clang binary here), and while those clang binaries do run on 10.9, they need a more recent version of cctools (specifically libtool and ranlib) than macOS 10.9 provides. Building cctools and ld64 requires two parts of llvm (libLTO and libtapi), so we must build bits of llvm, then build cctools, then build the rest of llvm and clang. After that, I re-build everything again to ensure that nothing has leaked in from the system tools or from the original official 4.0.1 clang binaries.

So overall, I would recommend we take a different approach. I propose to make this recipe build a set of packages suffixed with -bootstrap. No one need ever worry about how to build those (unless they want to join my pain). It does build fine and is repeatable (I must have built these about 20 times during development), but there is so much stuff there that making changes is always a bit of a minefield.

I don't know if people care about having cleaner recipes to rebuild the cctools and ld64 packages (Apple's binutils, basically), but if so, then we need: a recipe for llvm-lto-tapi that depends on the -bootstrap toolchain; a single split-package recipe for cctools and ld64 that depends on llvm-lto-tapi and the -bootstrap toolchain; and then a second recipe for all of the rest (including anything that can be built in-tree with the latest llvm code, so Julia is probably out but flang and rust are probably fine).

Doing it this way means we can free this new llvm split-packages recipe from the hoops we need to jump through to modernize a macOS 10.9 toolchain and from the Apple specific cctools and ld64 bits which should make it much easier for anyone familiar with 'normal' llvm to hack on it. If the bootstrap compilers ever need to be updated then I'll take that on.

msarahan commented 6 years ago

With https://github.com/conda-tools/conda-build-all/pull/93 and conda-build 3.0.28, conda-build-all is now compatible with conda-build 3.

mingwandroid commented 6 years ago

Before I spend time on this I need to know if it's an acceptable way forward.

Specifically, the conda-forge llvm, clang, cctools and ld64 build.sh scripts would need to install some new 'bootstrap' compilers (which are just renames of these ones) into a temporary location, add that to PATH, and manually run the activation scripts; the outputs from building these recipes would then become the compilers we all use for macOS.

I would also like to see consensus on how the recipes are split into packages (i.e. what goes into the sub/split-packages, from my perspective a single recipe should build as much as possible that's cross-platform or intends to become cross-platform).

isuruf commented 6 years ago

Why not use the compilers from defaults? If we try to build everything in one recipe, CI won't be able to build them.

mingwandroid commented 6 years ago

Why not use the compilers from defaults?

I do want to use them, but using them to build clang, llvm, and the rest needs special attention (manual installation and sourcing of activate scripts). The compilers are inextricably interwoven with the llvmdev and clangdev packages. They cannot co-exist and cannot be built with one depending upon the other without using the always_include_files hack, and that would be horribly brittle.

If we try to build everything in one recipe, CI won't be able to build them

The overhead of adding things to an llvm recipe is pretty low since the majority of build time is for llvm itself.

Even if it does exceed CI limits I do not think that should be allowed to limit the quality of how we construct these recipes. We make exceptions to upload binaries for other packages that have long builds and I believe the quality of the llvm recipes is very important.

isuruf commented 6 years ago

I do want to use them, but using them to build clang, llvm, and the rest needs special attention (manual installation and sourcing of activate scripts).

What I'm suggesting is to use the defaults compilers when we want to compile something, but use the conda-forge packages when llvm needs to be linked into a package, such as llvmlite, which JIT-compiles functions, or pocl, which needs access to libclang.

The compilers are inextricably interwoven with the llvmdev and clangdev packages. They cannot co-exist and cannot be built with one depending upon the other without using the always_include_files hack and that would be horribly brittle.

Okay. So is this a macOS-only issue, where the compiler is also clang? Can the compilers be installed somewhere other than PREFIX, so that we can build llvm?

The overhead of adding things to an llvm recipe is pretty low since the majority of build time is for llvm itself.

Building clang takes a lot of time as well.

Even if it does exceed CI limits I do not think that should be allowed to limit the quality of how we construct these recipes. We make exceptions to upload binaries for other packages that have long builds and I believe the quality of the llvm recipes is very important.

One other issue: if we wanted a different variant of one package, what do we do? For example, flang needs patches for clang, and cling needs patches for llvm.

mingwandroid commented 6 years ago

What I'm suggesting is to use the defaults compilers when we want to compile something, but use the conda-forge packages when llvm needs to be linked into a package, such as llvmlite, which JIT-compiles functions, or pocl, which needs access to libclang.

The 'but' is the issue; all of the time this is an 'and'.

Okay. So is this a macos only issue where the compiler is also clang?

The compiler is always clang on macOS and I want to enable it on Linux as soon as possible too (though not as the default). Then I want to investigate using the clang GCC frontend with the clang MSVC backend on Windows to allow us to build things that use the GNU stack with Windows UCRT support (probably in a year or two!).

Can the compilers be installed somewhere other than PREFIX, so that we can build llvm?

By this do you mean that the compilers are built to install to something like ${PREFIX}/clang so that is added to PATH whenever something needs to use these compilers? If so I'd rather not, and I have explained what I think is a better approach in some detail.

One other issue is that if we wanted a different variant of one package, what do we do? For example, flang needs patches for clang and cling needs patches for llvm

Presumably the patches are intended to end up upstream? If so, they should not break other things, and therefore you might be able to use the superset. Please point me at the patches and I will investigate as best I can.

isuruf commented 6 years ago

The 'but' is the issue; all of the time this is an 'and'.

I'm not sure what you mean by this.

By this do you mean that the compilers are built to install to something like ${PREFIX}/clang so that is added to PATH whenever something needs to use these compilers? If so I'd rather not, and I have explained what I think is a better approach in some detail.

Can you explain your reasons as to why this is not viable?

Presumably the patches are intended to end up upstream? If so they should not break other things, therefore you might be able to use the superset. Please point me at the patches and I will investigate as best I can.

See https://github.com/flang-compiler/clang and https://github.com/flang-compiler/clang/pull/38 for the windows patches on top of that. (These are for 4.0 branch, there are branches and PRs for 5.0 branch as well)

Btw, what's your suggestion for linux and windows?

msarahan commented 6 years ago

The 'but' is the issue; all of the time this is an 'and'.

I'm not sure what you mean by this.

I think what Ray means is that he wants people to use the same compiler all of the time. He thinks that we need to use the compiler we have right now as a bootstrapping compiler to build the CF recipe, and then use the CF recipe to build everything. There is no "use this compiler for A, and this compiler for B" - only how you bootstrap one compiler. That one compiler is the CF recipe, and the bootstrap compiler is our new one.

Usually, setting something like $PREFIX/bin/clang is a bad idea because clang is coded to detect the target platform from the clang executable filename. Keeping this lined up for cross compiling is important, I think. @mingwandroid may have more to add.

For Windows, I think we're sticking with MSVC activation scripts. Anaconda will continue to use the Intel fortran compiler for now. I think we could consider flang replacing it once it's proven.

For linux, is there any reason not to use Anaconda's new gcc packages? That doesn't encroach on any existing work in CF, so I don't see any conflict here. It's just replacing devtoolset2 with the new stuff.

isuruf commented 6 years ago

For windows and linux, I was asking about llvm and clang.

mingwandroid commented 6 years ago

Seems flang is nothing but a divergent fork of clang? Urgh, shame. OK, we should ignore flang (from the perspective of building it at the same time as the rest of the llvm ecosystem) until they have upstreamed their work, which will probably happen once they get sick of rebasing.

mingwandroid commented 6 years ago

Btw, what's your suggestion for linux and windows?

For windows and linux, I was asking about llvm and clang.

I don't know what you mean; nothing special needs to be done for Windows or Linux, since they do not use clang as the default system compiler. Please explain.

isuruf commented 6 years ago

I don't know what you mean, nothing special needs to be done for Windows or Linux since they do not use clang as the default system compilers. Please explain.

I mean, what happens to the recipe? Is that going to be the same as in Mac OS X?

OK, we should ignore flang (from the perspective of building it at the same time as the rest of the llvm ecosystem) until they have upstreamed their work, which will probably happen once they get sick of rebasing.

You mean don't package flang at all in conda-forge?

msarahan commented 6 years ago

If it's already packaged, why remove it? I think the tricky question is in maintaining binary compatibility, and that's all we're concerned about here. We don't want to take away useful tools. We do want to make software between defaults and CF compatible.

So, if flang is used on windows for odd things, fine. If flang is used on windows for all fortran code (numpy, blas, etc), then we need to figure out how to unify.

isuruf commented 6 years ago

If flang is used on windows for all fortran code (numpy, blas, etc), then we need to figure out how to unify.

That's the goal. flang can compile the reference-lapack without any issue and all tests pass. OpenBLAS can be compiled (with assembly routines and fortran files). SciPy can be compiled. (3 tests from the full test-suite fail though)

mingwandroid commented 6 years ago

I've asked this before. How do you hope to use flang on Windows when it requires using the UCRT? What do you do about numpy for Python 2.7? Do you wait until the end of Python 2.7?