@kyleabeauchamp, @jchodera, @jakirkham, @msarahan, @johanneskoester, @daler, @chapmanb, @jxtx, @jmchilton: please feel free to ping others and invite them :)
Definitely interested in learning more! For now, pinging @rmcgibbo, @mpharrigan, @cxhernandez, @marscher, @franknoe, @pgrinaway, @bas-rustenburg.
@bgruening thanks for the message! In fact we just discussed that yesterday!!
Conda-forge was born from two communities similar to bioconda and omnia (the SciTools and IOOS channels), with the goal of reducing redundancy and joining forces to produce high-quality recipes and binaries. I would love to see more communities join us here. We are not the dark side but we do have cookies :wink: (Well... a cookie cutter... sorry for the false advertisement.)
I am trying to put a blog post online next week with more info. We are also planning public (Google?) hangouts so we can have some online face-time and Q&A sessions.
Meanwhile, feel free to ask anything here, or in new issues if you have a very specific question.
Here is the gist of conda-forge:
- each recipe lists its maintainers in the `extra/maintainers` field;
- `conda-smithy`: we use that to lint the recipes, update the feedstocks, and it provides some convenience tools to work with the many-repos model.

There are many details I am leaving out and much more to talk about, but I will stop here for now.
The number one question we get is: why multiple repositories instead of one with all the recipes? We had (and still have) many discussions like this. However, all I have to say is: we tried the single-repo model and now we are trying the multiple-repos model. So far, the multiple-repos model has scaled much better, and none of the major worries we had came true.
This sounds great. @rmcgibbo is much more qualified to comment than I am here---he pioneered most of the omnia conda framework---but we ended up converging on our own build system (modeled loosely on conda/conda-recipes) simply because we weren't aware of any other way to tackle this.
Where should we look for all the gory technical details about the build systems and automation? This was the hardest part for us, since we needed (1) broad platform support (hence the use of a phusion/holy-build-box-64 build system for `linux`), (2) CUDA and OpenCL support (via the AMD APP SDK), and (3) automated builds in reproducible environments for `win`, `linux`, and `osx`. We're also trying to keep old versions of packages live for scientific reproducibility---we frequently publish code with our papers and provide `environment.yml` files to ensure reproducibility with identical versions. Our approach started with actual local hardware and evolved to use cloud services (currently travis-ci and AppVeyor).
I'd love to understand more about how the conda-forge build system differs from what we currently use in omnia's build system.
We are not the dark side but we do have cookies :wink:
Which ones? For humans or browsers? :laughing: Ok, it was terrible, but I had no self-control.
Yes, welcome all. :smile:
Please feel free to peruse what is going on at conda-forge and ask questions. The best place to get acquainted with or propose general discussion topics is probably the website repo (in particular the issue tracker). There are many issues there that are likely of interest and welcome healthy discussion of thoughts and personal experiences. Also, there may be a few closed issues there worth reading up on just to get a little bit of history (we are still quite young :wink:).
If you would like, feel free to submit a simple recipe or a few to get a feel for how everything works here. Also, feel free to check out our gitter channel for any generic questions you may have.
Once everyone has had a chance to get a feel for how everything works and what seems personally relevant, we can figure out meeting discussion topics in someplace TBD.
Again welcome.
Welcome @jchodera.
Where should we look for all the gory technical details about the build systems and automation?
This varies depending on the question. Let's try and direct you based on the points raised.
(1) broad platform support (hence the use of a phusion/holy-build-box-64 build system for `linux`)
This issue has basically moved in the direction of various proposals for how to move the Linux build system forward. Though there is a current strategy in place as well.
(2) CUDA and OpenCL support (via the AMD APP SDK)...
This is under active discussion. The reason is that this is tied to several issues, including build-system constraints, how features work, and how and which of these libraries get distributed. See this issue. There is a proposed example there of how we might get this to work. However, we haven't settled on anything yet.
(3) automated builds in reproducible environments for `win`, `linux`, and `osx`.
This is all over the map. :smile: In general, we use AppVeyor (Windows), Travis CI (Mac), and Circle CI (Dockerized Linux).
If you just want to read code, we can point you there. Proper documentation isn't quite there yet. Also, there isn't one singular issue for this, but it is discussed at various points in various issues. What sort of things would you like to know?
Hi all, checking in from bioconda. I've been poking around the conda-forge code and can't pin down where the magic is happening. Could you point to some code or to a description of what's happening to aggregate the one-recipe-per-repo repositories?
To further the discussion, here's a description of the bioconda build system and where you can find the code.
`.travis.yml` calls `scripts/travis-setup.sh` on OSX and Linux, which starts a Docker container on Linux or does the OSX setup otherwise. Then `scripts/build-packages.py` is run, which does most of the work.
The workflow is just like most anything else on github: submit a PR and wait for it to be tested. Once it passes, someone on the team merges it into master. Upon merging, travis-ci then runs again but on the master branch and this time upon completing, the built packages are uploaded to anaconda.
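For readers following along, here is a minimal sketch of that test-vs-deploy split. It is not bioconda's actual `build-packages.py`; the recipe path is hypothetical, and it assumes only the standard Travis environment variables plus the `conda build` and `anaconda upload` command-line tools:

```python
# Minimal sketch (not bioconda's real code) of "test on PRs, upload on master".
import os
import subprocess


def build_recipe(recipe_dir):
    """Build a recipe with conda-build and return the built package's path."""
    subprocess.check_call(["conda", "build", recipe_dir])
    out = subprocess.check_output(["conda", "build", "--output", recipe_dir])
    return out.decode().strip()


def maybe_upload(package_path):
    """Upload only on a push to master, never for pull requests."""
    on_master = (
        os.environ.get("TRAVIS_BRANCH") == "master"
        and os.environ.get("TRAVIS_PULL_REQUEST") == "false"
    )
    if on_master:
        subprocess.check_call(["anaconda", "upload", package_path])


if __name__ == "__main__":
    pkg = build_recipe("recipes/my-tool")  # hypothetical recipe directory
    maybe_upload(pkg)
```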
Aside from differences in the moving parts of the build systems, it sounds like we're all dealing with similar issues with respect to CUDA and gcc, etc. Would be nice to work out some best-practices that we could all use.
Welcome @daler.
Could you point to some code or to a description of what's happening to aggregate the one-recipe-per-repo repositories?
Sorry I'm not following this question. Could you please clarify what you are meaning by aggregate? It is a little unclear and I am a bit worried that there may be some misunderstanding of what is going on here. I'll try to clarify the big picture below.
To further the discussion, here's a description of the bioconda build system and where you can find the code....
Yes, SciTools and IOOS behave in a similar manner. However, those recipes along with many from conda-recipes are being ported over here as people from those groups seem to like this model.
Just to clarify, the model for building is very different here than with many recipes in a single repo. The reasons are varied, but I think the biggest difference is that it allows people to take ownership of the recipes/packages that are important to them and of the tools (CIs) used to test, build, and deploy. This includes making bug fixes, releases, feature support, etc. Similarly, it allows relevant discussion to break along those lines. In practice, this appears to be a huge asset. However, there are plenty of other reasons for one to consider this model.
How this works:
While understanding this infrastructure may at first seem daunting, it is actually not so bad and is not really necessary. However, if you are curious, we are more than happy to explain the details.
Maybe if you could please rephrase your question in terms of these steps, we can do a better job of answering your questions and providing you places to look for more information.
Aside from differences in the moving parts of the build systems, it sounds like we're all dealing with similar issues with respect to CUDA and gcc, etc. Would be nice to work out some best-practices that we could all use.
Absolutely, we would be happy to point you to relevant issues where these are being discussed. Just please let me know which of these you would like to know more about.
@daler, aggregation is done at the https://github.com/conda-forge/feedstocks/tree/master/feedstocks repo. This is created with conda-smithy, particularly this module: https://github.com/conda-forge/conda-smithy/blob/master/conda_smithy/feedstocks.py
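For anyone who wants the shape of that aggregation without reading `feedstocks.py` itself, here is a hedged sketch (not conda-smithy's actual code): walk the conda-forge org for `*-feedstock` repos via the GitHub API and register each one as a git submodule. Pagination is handled naively, and authentication and rate limits are ignored:

```python
# Illustrative only: collect every conda-forge "*-feedstock" repo and
# add it as a submodule under feedstocks/.
import subprocess
import requests


def iter_feedstock_repos(org="conda-forge"):
    page = 1
    while True:
        resp = requests.get(
            "https://api.github.com/orgs/{}/repos".format(org),
            params={"page": page, "per_page": 100},
        )
        resp.raise_for_status()
        repos = resp.json()
        if not repos:
            return
        for repo in repos:
            if repo["name"].endswith("-feedstock"):
                yield repo["name"], repo["clone_url"]
        page += 1


for name, url in iter_feedstock_repos():
    subprocess.check_call(["git", "submodule", "add", url, "feedstocks/" + name])
```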
Continuum is very interested in this particular aspect (I am Continuum's representative here, though others are also involved in contributing recipes and discussing build tools). The one-repo-per-recipe model is necessary, I think, for two reasons:
- keep the load on the CI services small, and avoid their log size and build time limits
- divide responsibilities and authority for each recipe with much finer granularity
The latter is the bigger issue here, since you all have had reasonable success with CI.
Continuum has started a community channel (https://anaconda.org/pycommunity), with the long-term plan to have that as a package aggregation center. In my mind, the most important facet of this effort is to unite the recipes and have a single canonical source for each recipe. I don't care whether it's on some project's page (e.g. matplotlib), or on conda-forge, or wherever - so long as one place is the official source, and finding that source and contributing to it is straightforward. conda-forge is a great place to host recipes because it provides the CI of those recipes, and I like the distributed maintainer model, but I also think that hosting recipes directly at projects, and having conda-forge build from indirectly-hosted sources, would be the ideal - that way the recipe would be holistically managed by the package originators.
For the pycommunity channel, we'll mirror or link packages from other channels. In the case of multiple package sources, we haven't quite figured out how to prioritize them (activity level? origin of package?). The hope is that rather than many organizations having to say "add our channel!", we'd instead have just one, and that one may be enabled by default for some "community edition" of miniconda/anaconda - or otherwise could be enabled with `conda install pycommunity`.
@jakirkham and @msarahan thanks for your pointers. One missing piece for me was that submitting a PR to `staged-recipes` triggers the CI (only travis, right?) to call `.CI/create_feedstocks`, which sets up the infrastructure, tokens etc. via `conda-smithy` and transforms the repo into something similar to what's in the feedstocks repo of submodules. Is that correct?
@msarahan -- Wholeheartedly agree that a single canonical source for each recipe is critical, and that finding that source and contributing needs to be straightforward. conda-forge/conda-smithy and pycommunity look like great tools to make that happen.
@jakirkham and @msarahan thanks for your pointers.
Glad to help, @daler. Hope it wasn't too much. Just wanted to make sure we had common context for our discussion. :smile:
One missing piece for me was that submitting a PR to `staged-recipes` triggers the CI (only travis, right?)...
When a PR is submitted all CIs (Travis/Mac, Circle CI/Linux, AppVeyor/Windows) are run and used to attempt to build the recipe, but do not release it.
...to call `.CI/create_feedstocks`, which sets up the infrastructure, tokens etc. via `conda-smithy` and transforms the repo into something similar to what's in the feedstocks repo of submodules. Is that correct?
Once the PR is merged, a Linux job in the Travis CI build matrix does the setup for the feedstock. It goes something like this for each recipe unless otherwise specified (steps 7, 8, and 9), including adding a `conda-forge.yml` and a `.gitignore` to the new feedstock.
As you have mentioned, this all basically happens through `conda-smithy`. However, there is some code that lives here for that purpose too. Take a look at this log for `configparser` and `entrypoints` to get a better idea.
After generating a feedstock, a global feedstock update is run. It is pretty simple. It updates the feedstocks with the latest commit of each feedstock on `master` at conda-forge. It also updates the listing. However, changes may not be reflected in the listing immediately, even if the changes have been made to the HTML source code, due to how GitHub caches GitHub Pages.
Perfect, these were just the kinds of details I was looking for. Thanks. Hopefully it can help get others up to speed as they join the discussion as well.
Hi guys, thanks for initiating this. It is very interesting to exchange ideas of how to build. I have two questions:
Have you ever considered using the anaconda build service? I recently had a look at it, and it seems to me to be centered on packages instead of repositories/organizations, which is kind of unfortunate, because it needs to be set up for each package, right?
Yes, especially for Windows builds. Mapping conda-forge's model to Anaconda.org should be OK - the organization would be conda-forge, and each package would be a different build. Maybe I'm missing how this is different from the other CI services? Anyway, the hangup has been that anaconda.org has some kinks that need to be worked out.
With your conda-forge model, how do you deal with dependencies between recipes?
ATM, I think the answer is "we don't." There has been discussion about coming up with networkx-driven guidance of what recipes to work on next, but that has been for human consumption more than automated buildout of dependency trees. Before getting involved in conda-forge, Continuum developed a build script that also uses networkx, and builds out these trees. That code assumes a single folder of packages, which can be created from conda-forge using conda-smithy. The dependency building code is part of ProtoCI: https://github.com/ContinuumIO/ProtoCI/blob/master/protoci/build2.py
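To make the networkx idea concrete, here is a rough sketch of dependency-ordered building over a single folder of recipes. It is not ProtoCI's code, and it assumes plain-YAML `meta.yaml` files (recipes using jinja templating would need to be rendered first):

```python
# Sketch: topologically sort recipes so dependencies build first.
import os
import yaml
import networkx as nx


def load_recipe(recipe_dir):
    """Return (package name, set of dependency names) for one recipe."""
    with open(os.path.join(recipe_dir, "meta.yaml")) as f:
        meta = yaml.safe_load(f)
    reqs = meta.get("requirements") or {}
    deps = (reqs.get("build") or []) + (reqs.get("run") or [])
    # Strip version pins like "numpy >=1.10" down to the bare package name.
    return meta["package"]["name"], {d.split()[0] for d in deps}


def build_order(recipes_root):
    graph = nx.DiGraph()
    local_deps = {}
    for entry in os.listdir(recipes_root):
        recipe_dir = os.path.join(recipes_root, entry)
        if os.path.exists(os.path.join(recipe_dir, "meta.yaml")):
            name, deps = load_recipe(recipe_dir)
            local_deps[name] = deps
            graph.add_node(name)
    for name, deps in local_deps.items():
        for dep in deps & set(local_deps):  # only edges between local recipes
            graph.add_edge(dep, name)
    return list(nx.topological_sort(graph))


print(build_order("recipes"))
```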
Thanks for the clarification. My point is the following: if the anaconda build service could be set up per repository and not per package, CI job limits are no longer a reason to have separate repositories per recipe, right?
I think separate repos per recipe are still a good thing, because it gives you complete control over who has permission to accept changes to a recipe. I don't know how we'd do that with many recipes under one umbrella.
Before getting involved in conda-forge, Continuum developed a build script that also uses networkx, and builds out these trees. That code assumes a single folder of packages, which can be created from conda-forge using conda-smithy. The dependency building code is part of ProtoCI: https://github.com/ContinuumIO/ProtoCI/blob/master/protoci/build2.py
Would this work on the `feedstocks` repo, possibly with some tweaks? This might be a good way to get things going, and it would also avoid having several scripts created here that kind of do something like this. Thoughts?
Sure, I think so. It would need to be adapted to look into the nested recipes folder, but otherwise I think it would work fine. It may also have trouble with jinja vs. static version numbers - but again, that's tractable.
@msarahan I agree, this is in general a nice advantage. I asked because the situation is different for bioconda. There, we have a rather controlled, collaborative community, and it is much more convenient to have all recipes in one repository (e.g. for toposorting builds).
Yeah, the one thing we don't have figured out well yet is how to edit multiple recipes at once. For aggregating them and building them as a set, I think conda-smithy + ProtoCI abstract away the difficulties with one repo per recipe.
But if you build them as a set, you have the problem with job limits in the CI again, don't you?
Yeah, I figure the nested directory structure needs to be addressed. Otherwise, adding jinja template handling is probably valuable no matter where it is used, no?
adding jinja template handling is probably valuable no matter where it is used, no?
Absolutely. In case you missed it, @pelson has a nice snippet at https://github.com/conda-forge/shapely-feedstock/issues/5#issuecomment-208377012
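For context, "rendering the jinja before the YAML parse" looks roughly like this (a sketch in the same spirit as the linked snippet, not a copy of it; it only copes with simple `{% set ... %}`-style templating):

```python
# Render a recipe's jinja templating, then parse the result as YAML.
import yaml
import jinja2


def render_meta(path):
    with open(path) as f:
        template = jinja2.Template(f.read())
    # {% set ... %} assignments are evaluated; plain unknown variables
    # render as empty strings with jinja2's default Undefined.
    return yaml.safe_load(template.render())


meta = render_meta("recipe/meta.yaml")
print(meta["package"]["name"], meta["package"]["version"])
```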
But if you build them as a set, you have the problem with job limits in the CI again, don't you?
Well, one could consider some sort of debouncing to handle this. Namely, even though one makes the changes together and ultimately submits them all, we would manage the submissions/builds somehow so that they are staggered. This will likely require some thought, but it is useful for some workflows with the recipes.
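To illustrate what I mean by staggering (purely illustrative; `submit` here is a stand-in for whatever pushes a change and triggers CI):

```python
# Debouncing sketch: space out a batch of related submissions so the
# CI services are not hit with everything at once.
import time


def submit_staggered(feedstocks, submit, delay_seconds=300):
    """Call `submit` for each feedstock, pausing between submissions."""
    for i, feedstock in enumerate(feedstocks):
        if i:
            time.sleep(delay_seconds)
        submit(feedstock)


# Example usage: "submit" is just print here.
submit_staggered(["pkg-a-feedstock", "pkg-b-feedstock"], submit=print, delay_seconds=1)
```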
But if you build them as a set, you have the problem with job limits in the CI again, don't you?
With anaconda.org, we don't have artificial limits. There are still strange practical limits - like logs that get too large end up making web servers time out. These are tractable problems.
Interesting, thanks for the link. I'll take a closer look.
@msarahan, I know, you don't have these limits, but my understanding was that anaconda.org cannot out of the box build recipes as a set, right? You have to register an individual trigger for each of them? And then, their order of execution is no longer determined, and they can't depend on each other. Or am I missing something here?
@johanneskoester there would need to be some intermediate representation as a collection of recipes. Then that ProtoCI tool would be able to build things that changed. It is written to build packages based on which packages are affected by a git commit. Here, obviously only one recipe could trigger, rather than many changing at once. That does not affect its ability to build required dependencies, though - and they'll be built in topological order.
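A hedged sketch of that trigger side, assuming a dependency graph like the one sketched further up (with nodes named consistently with the recipe directories) and recipes living under a `recipes/` folder:

```python
# Sketch: find recipes touched by a commit, then everything downstream
# of them, in topological (dependency-first) order.
import subprocess
import networkx as nx


def changed_recipes(git_range="HEAD~1..HEAD", recipes_root="recipes"):
    files = subprocess.check_output(
        ["git", "diff", "--name-only", git_range]
    ).decode().splitlines()
    prefix = recipes_root + "/"
    return {f.split("/")[1] for f in files if f.startswith(prefix)}


def rebuild_set(graph, changed):
    affected = set(changed)
    for name in changed:
        affected |= nx.descendants(graph, name)  # everything depending on it
    return [n for n in nx.topological_sort(graph) if n in affected]
```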
@msarahan, maybe this is not the right thread (I don't want to bother the rest with my detail questions here, so feel free to stop me if you feel this becomes off-topic). Ok, but protoCI has to run e.g. on travis, right? That means, even if protoCI triggers builds on anaconda.org in the right order, travis would still need to wait on the results, in order to be able to report back to github? Which would result in the same timeout issues? Sorry if I misunderstand something here.
ProtoCI was designed to run on anaconda.org. If anyone else wants to get it to run on Travis, that's cool, but that wasn't what it was written for. It would not be triggered by Travis or any other build - rather, it would be a new CI service in addition to or instead of the existing CI services.
Great! Now it makes sense, sorry I did not know that. So, is there any documentation on how we could set up ProtoCI for bioconda? We already have a repository with multiple recipes in place. Or is that not possible yet?
I'm not completely sure how possible it is. I have been involved, but not the one doing most of the real work. It is live on conda-recipes, and you should start with the `.binstar.yml` there as an example, but you'll have to tweak it for your build queue:
https://github.com/conda/conda-recipes/blob/master/.binstar.yml
In short, protoci should be installed on the build workers. It is on ours, but I can't speak for how you have your queue set up. Your build script should just call the `protoci-difference-build` entry point.
Thanks Mike, that's good news!
A question regarding conda-forge: On which linux system/libc version do you build?
That's sort of in flux. See https://github.com/conda-forge/conda-forge.github.io/issues/29
I think the current one is this: https://github.com/pelson/Obvious-CI/blob/master/obvious-ci.docker/linux64_obvci/Dockerfile
Thanks for this healthy discussion, exactly what I wanted to trigger :)
One additional question from me. Does conda-forge have any other channels (`default`, I assume) activated during build time?
Does conda-forge have any other channels (`default`, I assume) activated during build time?
Nope. Only the `default` channel.
Just to say that I'm in favour of the single repos (one repo per recipe) as it currently is in conda-forge/feedstocks. Although I didn't go as ambitious as the bioconda/ioos/scitools/omnia crowd, I've also been maintaining a set of recipes that we needed for our project, menpo. Most importantly, I've been really trying to drive Windows support, because so many people in Computer Vision still use Windows for historical (Matlab) reasons. So I'm usually keen to try and help with upstream support for Windows (as well as @msarahan, who has been the real Windows champion).
I'm very interested in the CUDA/OpenCL builds that you guys seem to have. I wonder if we could become the go-to place to pick up projects like Theano for deep learning?
Just to say that I'm in favour of the single repos (one repo per recipe) as it currently is in conda-forge/feedstocks
Thanks for the honesty. Technically conda-forge/staged-recipes is a many-recipe repository too; it is just that we've automated it so that recipes immediately get deleted and added to their own repository on merge. 😉
With that in mind, you may be aware of conda-build-all, which is the tool we use in this repo to build all recipes in a single repo. It (and its predecessor ObviousCI) was the tool we used in IOOS and SciTools (amongst others) to build and upload to our respective channels. Because of staged-recipes' dependence upon it, we are going to need to continue to maintain that capability, so if you're looking for shared tooling for the single-repo, many-recipe use case, you might want to take a look.
As you've highlighted, even if you don't favour the approach we have taken at conda-forge, there is still huge potential for us to collaborate so that we can collectively package in a consistent and coherent way. Your input so far has been exceptionally valuable, and long may it continue! 👍
As you've highlighted, even if you don't favour the approach we have taken at conda-forge, there is still huge potential for us to collaborate so that we can collectively package in a consistent and coherent way. Your input so far has been exceptionally valuable, and long may it continue! :+1:
I would like to highlight this and raise 3 points to start closer cooperation.
@bgruening this sounds very reasonable. General purpose libraries can go into conda-forge, but I can also imagine just passing them over to Continuum. I am also not quite sure whether to prefer conda-recipes or conda-forge for that... Can I expect that conda-forge PRs are handled faster than on conda-recipes?
Can I expect that conda-forge PRs are handled faster than on conda-recipes?
This was my expectation. One huge benefit of the bioconda model is that if you need a recipe, you can get it within 30 min. conda-forge is hopefully way faster than conda-recipes :)
Can I expect that conda-forge PRs are handled faster than on conda-recipes?
This was my expectation. One huge benefit of the bioconda model is that if you need a recipe, you can get it within 30 min. conda-forge is hopefully way faster than conda-recipes :)
If you are proposing a new recipe, then you have to wait on one of the conda-forge "staged-recipes" maintainers.
If you are proposing an update to an existing recipe, then you have to wait on one of the feedstock maintainers (as listed in `recipe/meta.yaml`).
If you are proposing an update to an existing recipe for which you are a maintainer, you are waiting on yourself (and maybe the CI to finish).
In general I'm keen to be very open about membership of the "staged-recipes" group. The important qualities of a maintainer in that team are an eye for detail, a feeling for what is "maintainable", and shed loads of experience of reading and writing conda recipes. I suspect most people in this thread meet those criteria, and after someone proposed just 3 or 4 PRs which merge smoothly, I'd be happy to say they were a good candidate for membership (though with all the noise on conda-forge, it is probably necessary to ask, rather than have it suggested).
Hey guys, I have ported a couple of my own recipes from bob.conda to conda-forge, and I used this script to port things over. It is not well written and I am sure you can write better scripts, but as @jakirkham mentioned, maybe sharing it with you guys could help you automate your porting process.
Thanks for sharing this, @183amir. :smile:
I guess the most important part is that I load recipes with ruamel.yaml and take the `example` recipe as a base, updating it with my own recipe.
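For the curious, the core of that approach looks roughly like this (a sketch, not the actual script; the paths are hypothetical, and plain-YAML recipes are assumed, so any jinja lines would need extra handling):

```python
# Overlay an existing recipe onto conda-forge's example recipe, using
# ruamel.yaml's round-trip mode so comments and ordering are preserved.
import ruamel.yaml

yaml = ruamel.yaml.YAML()  # round-trip load/dump

with open("staged-recipes/recipes/example/meta.yaml") as f:
    base = yaml.load(f)
with open("old-channel/my-package/meta.yaml") as f:  # hypothetical source
    ours = yaml.load(f)

# Overwrite the example's sections with ours, keeping any conda-forge
# boilerplate our old recipe lacked.
for section in ("package", "source", "build", "requirements", "test", "about"):
    if section in ours:
        base[section] = ours[section]

with open("meta.yaml", "w") as f:
    yaml.dump(base, f)
```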
At bioconda, we are evaluating whether it makes sense to specify the compiler in the recipe. Our feeling is that this provides some advantages, e.g.
We are unsure though about osx (since clang seems to be used there by e.g. the default channel). What are your thoughts on this? It looks like no conda-forge recipe currently depends on gcc.
Hi @johanneskoester, sorry, your comment got buried in notifications I am afraid; I am only now discovering it while going back through some things.
Generally, our feeling here is we want to move away from using a `gcc` package. We have been making steps in that direction. In particular, we now use CentOS 6 with devtoolset-2 for nearly all of our building on Linux. On Mac, we require 10.7, where the system compiler (`clang`) has C++11 support. This largely meets our needs at present. There are a few exceptions when dealing with OpenMP and/or Fortran. Though we may re-evaluate our strategy here in the future. See this issue ( https://github.com/conda-forge/conda-forge.github.io/issues/29 ) for more details on various proposals.
While it is a nice idea in theory to use a compiler package, in practice this doesn't fare so well. One reason is we can't make any guarantees about GLIBC compatibility on Linux, as the compiler could be used to build anywhere. So, we have opted to work on proper docker containers that include a compiler in them. Another reason this doesn't work well is that if we run into an issue with the compiler package (as we did recently), we are largely incapable of fixing it due to its long build time, which exceeds CI limits. As a result, we can at best use kludgy hacks to try and solve the problem. In the worst case, we find ourselves crippled.
Maybe a better long term strategy that provides the same guarantees without the same issues would be to create a pseudo compiler package. This package could be used to verify that a compatible compiler can be found. Additionally, this package could be used to perform some sort of configuration to ensure the compiler it found is used. This would allow proper constraints in an explicit manner, but avoid the pains associated with the packaged compiler.
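To make the pseudo-package idea concrete, here is a sketch of the kind of check such a package might run at build time; nothing like this exists yet, and the version floor below is purely illustrative:

```python
# Verify that a compatible compiler is already present, rather than
# shipping one. Hypothetical sketch; 4.8 mirrors devtoolset-2's gcc.
import re
import subprocess
import sys

MIN_GCC = (4, 8)


def find_gcc_version():
    try:
        out = subprocess.check_output(["gcc", "--version"]).decode()
    except (OSError, subprocess.CalledProcessError):
        return None
    match = re.search(r"(\d+)\.(\d+)\.\d+", out)
    return tuple(map(int, match.groups())) if match else None


version = find_gcc_version()
if version is None or version < MIN_GCC:
    sys.exit("no gcc >= %d.%d found; please install one" % MIN_GCC)
print("compatible compiler found: gcc %d.%d" % version)
```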
Thanks for the answer! We also used devtoolset-2 before, but unfortunately we went in exactly the opposite direction, requiring the gcc package for all recipes that compile something. Regarding the gcc issues: they appear to be in the conda-forge gcc package, right? Why do you shadow the gcc package from the default channel at all? That one also should not have the CI issues, because Continuum builds on anaconda.org, right?
Thanks for the answer! We also used devtoolset-2 before, but unfortunately we went in exactly the opposite direction, requiring the gcc package for all recipes that compile something.
I see. Well, we are partially following @msarahan's lead here. Though maybe a bit slower than he would like. He has made a strong case for not using the packaged gcc.
Regarding the gcc issues: they appear to be in the conda-forge gcc package, right?
Nope. We tried to package it because of the problems we had with it, but we couldn't.
Why do you shadow the gcc package from the default channel at all?
As stated before, we don't.
That one also should not have the CI issues, because Continuum builds on anaconda.org, right?
Not sure I follow this question. Could you please clarify what you mean here?
We all love conda, and there are many communities that build awesome packages that are easy to use. I would like to see more exchange between these communities: to share more build scripts, to develop one best-practice guide, and finally to have channels that can be used together without breaking recipes - a list of trusted channels with similar guidelines.
For example, the bioconda community, specialised in bioinformatics software: they have some very nice guides on how to develop packages, and they review and bulk-patch recipes when there are new features in conda to make the overall experience even better. ping @johanneskoester, @daler and @chapmanb of bioconda fame
Omnia has a lot of cheminformatics software and a nice build box based on phusion/holy-build-box-64 + CUDA and the AMD APP SDK. ping @kyleabeauchamp, @jchodera
With conda-forge there is now a new one, and it would be great to get all interested people together to join forces here and not replicate our recipes or copy them from one channel to another just to make them compatible.
Another point is that we probably want to move recipes to `default` at some point and deliver our work back to Continuum - so that we can benefit from each other. I can imagine that we all form a group of trusted communities and channels and activate them by default in our unified build box - or we have one giant community channel. All this I would like to discuss with everyone who is interested, and come up with a plan for how to make this happen :)
What do you all think about this? As a next step I would like to create a doodle to find a meeting date where at least one representative from each community can participate.
Many thanks to Continuum Analytics for their continued support and the awesome development behind scientific Python and this package manager. ping @jakirkham @msarahan