Closed mrocklin closed 8 years ago
Is there a reason you can't install conda-like packages with pip? I believe pip is the default installer for Python, and is likely the only thing we will support.
Pip tends not to work well with packages that have non-Python dependencies. This includes a lot of the Numeric/Scientific Python stack (e.g. Pandas.) This community is a decently sized chunk of the Python ecosystem, and would definitely appreciate better support from RTD (which, btw, I <3, thanks!)
With regard to default installers, you may find it interesting that conda was recently blessed and brought in under the Python Packaging Authority (cc @ncoghlan).
conda is actually still its own org (since it isn't Python specific), but we do recommend it when Python folks need a cross platform package manager that can handle the Python runtime and arbitrary external dependencies. By contrast, the PyPA toolset focuses specifically on Python packages (including C extensions), including playing nice with redistributors like conda and the Linux distros.
Interesting. So it would be a way to install binaries onto the build server? Does it work with virtualenvs, or is it installing things system wide?
Exactly.
It has its own system for virtual environments that relies on linking many packages into environments. Interestingly, because it packages binaries, Python is itself just a package, so it's easy to do things like quickly spin up two environments in Python 2 and 3 for simultaneous testing. Many who use it (myself included) vastly prefer it to virtualenv, but that's subjective.
@asmeurer might have more information.
pip/virtualenv & conda work at different levels.
The PyPA tools are designed to work within a larger system that provides the Python runtime and any external dependencies. That may be a Linux distro package manager, something like homebrew on Mac OS X, or just downloading and running a binary installer from python.org.
By contrast, conda is such a larger system - rather than being Python specific, it's a full "cross platform platform", designed to manage arbitrary binaries, including Python runtimes and external dependencies. This means it doesn't integrate with other environments the way pip can, but it also means it can be used to manage components where pip will fail completely.
Oh, and to answer the "is conda system wide" question, no it isn't. It's designed to be run as an ordinary user, creating installation environments in their own directory space without needing root access.
:+1: this would be a fantastic addition.
I figured out how to get the baseline scientific packages (numpy, scipy) and pip to work together to build everything I need for my docs, but it was a real guessing game to figure out which combinations of pip packages could safely install together.
I actually tagged and released v0.1 of my package before I realized that I would not be able to build it from scratch on RTD, because it only worked when I changed the requirements.txt file incrementally. This is obviously somewhat unfortunate and is actually pretty typical for the problems that arise when using pip to install scientific python libraries.
I agree it would be a great addition. As a matter of fact I'm struggling with installing numpy to build the docs for my project. @shoyer how did you solve the problem?
@tritemio The trick is to give the virtualenv in which you build your docs access to the global site-packages directory -- see Advanced Settings > Use system packages. RTD has numpy 1.8, scipy and matplotlib installed system wide. I set up my conf.py to print out the versions when building the docs: https://github.com/xray/xray/blob/v0.2/doc/conf.py
As for testing, to ensure that you can build your docs from scratch in a new virtualenv (each version of the docs gets its own virtualenv), try deleting the build environment: http://read-the-docs.readthedocs.org/en/latest/builds.html#deleting-a-stale-or-broken-build-environment
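A minimal sketch of the conf.py idea described above: report the versions of the packages that the docs build actually imports. The helper name report_versions and the module list are illustrative assumptions, not the code in the linked conf.py.

```python
# conf.py (excerpt): print which package versions the docs build picks up.
import importlib


def report_versions(module_names):
    """Return {module name: version string} for each requested module."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not available in this build environment.
            versions[name] = "not installed"
        else:
            # Most scientific packages expose __version__; fall back if not.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Assumed list of packages worth reporting; adjust to your own docs build.
for name, version in report_versions(["numpy", "scipy", "matplotlib"]).items():
    print("%s: %s" % (name, version))
```

The printed lines end up in the build log, which makes it easy to see whether the system-wide packages or your pinned ones were used.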
@shoyer thanks! Your suggestions are narrowing down my problems, hope to fix them soon ...
+1
I'd love to have conda support in RTD, in the same way that Travis CI does it: http://conda.pydata.org/docs/travis.html
FWIW, if RTD used buildpacks like Heroku and Cloud Foundry, there's a conda-buildpack that can detect a Conda environment.yml file and spin that up. That spec supports creating environments that have both Conda and pip packages in them. If nothing else, its conda_compile script might serve as a good reference if someone wants to take a stab at implementing this in RTD.
I am currently working on a project which uses numba and am trying to upload the project onto RTD. This cannot currently be done as it requires the llvm compiler. Is it possible for RTD either to install llvm and include it in the system packages under the Advanced Settings, or add support for conda?
Using buildpacks or a container tech like Docker to do builds would require fundamentally redesigning the way ReadTheDocs works. On the other hand, that might not be a bad idea at some point, especially as Docker-based public cloud services with a grants program for open source projects come online.
Disclosure: I work for Red Hat, OpenShift Online has a grants program that includes open source projects in its scope, our next generation architecture is based on Docker & Kubernetes, and I personally believe that hosting a valuable service like RTFD would be a great way for us to support the community. So while porting to the current OpenShift architecture likely wouldn't make sense, porting to Docker/Kubernetes would open up both Google Container Engine and a future version of OpenShift Online as hosting options.
:+1: for the addition of conda, I am interested in hosting a package depending on numba too.
While conda isn't supported, I've tried at least to disable the setup.py running and I'm having this weird error: https://github.com/rtfd/readthedocs.org/issues/1240
:+1: I know a few packages that need a more recent scipy or matplotlib for their docs build and pip install on readthedocs fails ... conda would be a very nice solution!
Practically, won't this be necessary if you want to build older versions of your documentation that require older versions of NumPy, matplotlib, possibly with different APIs from whatever version of NumPy is installed system-wide?
@chebee7i practically speaking, matplotlib and numpy have strong backwards compatibility guarantees, so I'm not too worried about API changes for them. Though matplotlib has been talking about a 2.0 release with a new default colormap...
@shoyer but functionality does change between versions and this can cause the documentation to be wrong, especially if you rely on buildtime-generated documentation through the sphinxext IPython directive. And I am thinking much more generally than matplotlib and numpy, but even extending to just pandas reveals very recent backwards compatibility changes.
+1 for conda support.
+1. I'm currently trying to update old docs that use Python, R and rpy2; I can almost trivially have everything working fine in a conda 2.7 env.
Am I completely mistaken or does readthedocs allow us to use docker ( http://read-the-docs.readthedocs.org/en/latest/development/buildenvironments.html#configuration ) ( http://read-the-docs.readthedocs.org/en/latest/api/doc_builder.html#readthedocs.doc_builder.environments.DockerEnvironment )? If so, there are pre-existing images for miniconda and miniconda3 ( https://hub.docker.com/u/continuumio/ ), which could be used here.
Went ahead and updated the description here to flesh out the needed work. If folks have any thoughts feel free to comment here. I should be working to add conda support in the next few weeks.
+1
Just make sure you provide conda through some proxy, to speed up downloads and avoid extra charges.
I am so psyched that this is finally happening! Let me know if you need any help testing this out.
A few notes:
- python=3.5 (for example) in environment.yml.
- conda update conda.
- [ ] Munge name property of environment.yml to proper RTD value
Turns out you can override the name in the command using the -n parameter. See the create command help here:
C:\Development\Projects>conda env create -h
usage: conda-env-script.py create [-h] [-f FILE] [-n NAME] [-q] [--force]
                                  [--json]
                                  [remote_definition]

Create an environment based on an environment file

Options:

positional arguments:
  remote_definition     remote environment definition / IPython notebook

optional arguments:
  -h, --help            Show this help message and exit.
  -f FILE, --file FILE  environment definition file (default: environment.yml)
  -n NAME, --name NAME  environment definition
  -q, --quiet
  --force               force creation of environment (removing a previously
                        existing environment of the same name.
  --json                Report all output as json. Suitable for using conda
                        programmatically.

examples:
    conda env create
    conda env create -n name
    conda env create vader/deathstar
    conda env create -f=/path/to/environment.yml
@Korijn Great -- thanks for the clarification. I will update the ticket.
I have a basic implementation working, and I'd love to test this against a repo that someone is using in the wild. @Korijn do you have a good repo that has an environment file checked into it?
If you're still looking for test repos, feel free to pull from https://github.com/OpenHydrology/floodestimation. Does the environment.yml need to have sphinx etc. in it?
@faph Great, thanks. I don't think that it will need Sphinx specified. It will use the same order of operations that we support currently:
So the plan is to run conda env update inside of our environment, where we have already installed a base set of packages. This means you can specify a different version of Sphinx, etc. but we will have a default set that is used if they aren't specified.
Sounds sensible, thanks.
@faph hmm, I'm getting an "Error: Invalid package specification: appdirs 1.4*" from conda now.
As far as I can tell, the format for the environment file isn't documented anywhere :/ I'm using conda 3.18.3, which I believe is up to date. Does that work locally for you?
There's some documentation for environment.yml here: https://github.com/conda/conda-env#environment-file-example
It looks like @faph provided you with an invalid file -- it's missing some equals signs. This version works:
name: env
channels:
- https://conda.anaconda.org/openhydrology
dependencies:
- python
- appdirs=1.4*
- sqlalchemy=0.9*
- numpy=1.9*
- scipy>=0.16
- lmoments3>=1.0.2
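To catch this kind of mistake before a build, a small check along these lines could flag entries that separate the package name and version with only a space. This is a hypothetical helper, not part of conda, and the regular expression only covers the simple case from the error reported above.

```python
import re

# Matches "name version" with whitespace and no operator, e.g. "appdirs 1.4*".
# Assumption: package names consist of letters, digits, "_", "." and "-".
_MISSING_OPERATOR = re.compile(r"^[A-Za-z0-9_.\-]+\s+[0-9]")


def find_suspect_specs(dependencies):
    """Return dependency strings that look like 'name version' without '=' or '>='."""
    return [
        dep
        for dep in dependencies
        if isinstance(dep, str) and _MISSING_OPERATOR.match(dep)
    ]


deps = ["python", "appdirs 1.4*", "sqlalchemy=0.9*", "numpy=1.9*", "scipy>=0.16"]
print(find_suspect_specs(deps))
```

Entries written as name=version or name>=version pass through untouched; only the space-separated form is reported.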
Sorry! Thanks. I always get this wrong between meta.yml, environment.yml and conda install x!
Will update in the repo. ... done.
Hrm, I've run into another issue, where the conda env update command doesn't accept the --prefix argument. It only accepts a name, which doesn't seem to let you override where that environment might be stored (it defaults to $HOME/miniconda/envs, but doesn't seem to allow overriding this path).
I'm looking into this more, but wonder if anyone has thoughts here. Ways to fix this:
- Pass a --prefix instead of a --name to the conda env update command
The other option is to create the environment all at once, but that would then override all the packages that we install (Sphinx, etc) to versions that we specify, I believe.
You could do an additional call to conda install afterwards to make sure the right versions of sphinx etc. are installed?
I'll provide a sample repo soon, sorry for the delay. :)
Suppose you have a conda environment installed at $CONDA_ENV_PATH. In my tests, $CONDA_ENV_PATH/bin/conda env update has some weird behavior: it both updates the conda environment the command is run from AND installs a new environment at the name provided in environment.yml. This must be a bug...
You should be able to set envs_dirs in conda config. If you set that to your preferred path you should be able to create and update envs as per the original plan. Just use --name instead of --prefix.
Sounds like CONDA_ENVS_PATH is what I was looking for, for overriding where it looks for the named environment. This is a hacky solution, and I'd prefer to just use the --prefix, like in the conda env creation command, but that will at least give a path forward for now.
@shoyer Hmm, that definitely sounds like a bug.
I will hopefully have this at least to the point where I can post a Work in Progress PR in the near future.
Before creating the environment, you can just do:
conda config --add envs_dirs path/to/envs
Then you can just conda env create/update --name rtd_env --file path/to/environment.yml.
I will hopefully have this at least to the point where I can post a Work in Progress PR in the near future.
Sounds exciting, @ericholscher.
So, there are many ways to install stuff with conda and I'm trying to get a grasp on how this is going to work. I see the environment.yml file has been discussed. There is also a bdist_conda subcommand for python setup.py to build a conda package. Finally, some cases provide a *.recipe or otherwise named directory with a recipe and scripts to build the package with conda-build. In the last two cases, the package must then be explicitly installed after building, but it will pull all runtime dependencies with it. If there are other cases I have missed, feel free to add them.
How do you think these cases should be handled? Is there some way for us to specify how we would prefer it to be installed? In the simplest case, it could be a shell script, but if you have some better ideas I would be interested to hear.
The most common approach seems to be to first create the environment with requirements (either using a full environment.yml file or requirements file) and then to install the package into that with python setup.py install. See for example http://conda.pydata.org/docs/travis.html
The alternative is first to build the package using the conda recipe (the folder that often has the word recipe in it). Then install that into a new environment using the --use-local option. I think this option is mostly used for people who want to test building the conda package (and optionally deploy it to say anaconda.org). I think this option is overkill for RTD. Also, the recipe sometimes lives in a different place/repo than the package's source code.
Imho RTD could just stick with the environment.yml file. It would be nice to let the user specify the actual name of the file, for example if the requirements for building the docs are different from, say, testing or production.
This has now been deployed. We are supporting the environment.yml file for now.
You can see more in the docs here: http://docs.readthedocs.org/en/latest/conda.html -- It would be great to have folks test this out. I ran into some issues during development with python version mismatches and some of the other interesting parts of conda. Let me know if it isn't working for folks so we can make it work better.
Excellent, many thanks!
I tried running it on one of my repos (develop branch) but I'm afraid it failed, see https://readthedocs.org/projects/floodestimation/builds/3601598/ .
Not sure if I need to do anything to trigger a conda build. I just supplied the readthedocs.yml (https://github.com/OpenHydrology/floodestimation/blob/develop/readthedocs.yml). I've still ticked the option to install the package itself with setup.py.
I'm assuming I should see some conda build steps in the output log if it's recognising the conda.file key?
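As an illustration of what "recognising the conda.file key" might involve, here is a small sketch that reads the key from an already-parsed readthedocs.yml. It assumes the file has been loaded into a plain dict; the function name is hypothetical, and nothing beyond the conda.file key itself is taken from RTD's real config handling.

```python
def conda_file_from_config(config):
    """Return the environment file named by the conda.file key, or None if absent."""
    conda_section = config.get("conda")
    if not isinstance(conda_section, dict):
        # No conda section: the build would presumably not take the conda path.
        return None
    return conda_section.get("file")


# Equivalent of a readthedocs.yml containing:
#   conda:
#       file: environment.yml
print(conda_file_from_config({"conda": {"file": "environment.yml"}}))
```

A None result here corresponds to the situation in the failed build: no conda steps appear in the log because the key was never picked up.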
@faph the problem is that you only specified the openhydrology channel (apart from the default channel). None of these channels seem to contain the sqlalchemy you requested.
I just ran conda search sqlalchemy and it shows that 0.9.* only supports Python versions up to 3.4, so if the python3 running on readthedocs is Python 3.5, it won't find the default sqlalchemy package.
This means you'll have to either pin your environment to python ==3.4, build your own conda package for whatever version of sqlalchemy you intend to use and upload it to your openhydrology channel, or add to your channels list the URL of a channel that has the package you're after (look here to find an appropriate one).
Scratch that, I misread where the error occurred. Given where it seems to call your code:
"/home/docs/checkouts/readthedocs.org/user_builds/floodestimation/envs/develop/lib/python3.4/site-packages/floodestimation-0.7.1+3.g05e75eb-py3.4.egg/floodestimation/db.py"
I think perhaps your environment is not activated? Because your environment file specifies the name env and here it's running from develop.
Upon closer inspection, none of the steps mention creating your conda environment, so I wouldn't know how your dependencies could be available in the develop environment.
Also is the pip install task supposed to install these packages in the user's environment? Because then I hope that doesn't cause issues if any of the dependencies happen to clash. But I must admit that I don't fully understand how RTD builds the docs.
Ok, I got a bit further this morning. Latest build (still failing) here: https://readthedocs.org/projects/floodestimation/builds/3602292/. At least it's now mentioning the conda install steps. So far it does:
- develop. Note that it installs Python 2!
- environment.yml file. (This throws some warnings from the system python in /usr/local/lib/python2.7 although it completes without errors).
- sphinx-build. Fails on missing sphinx package!
So it seems that some of the packages installed initially into the environment don't survive the various subsequent steps, possibly caused by a change from Python 2 to Python 3 and/or some pip installs halfway through.
Just a thought, would it be cleaner to let the user specify all dependencies, including sphinx? Since we've got the conda.file key in the RTD yaml config, it's no bother just to create an environment-rtd.yml file with the sphinx dependencies. This makes it explicit what's required and everything can get installed in one go with a clean conda env create from the environment file. We could create an rtd channel on anaconda.org containing all the sphinx dependencies including sphinx extensions to make it easier for people to specify the dependencies as conda packages without having to hunt for or build them.
@ivoflipse RTD is taking care of environment naming and activating, that bit seems to work. The conda environments seem to be at /home/docs/checkouts/readthedocs.org/user_builds/{repo}/conda/{branch}, with {branch} being the environment name which subsequently gets used in conda's --name argument.
I don't know where the conda root environment lives, but that should not affect the build process. Conda gets called ok.
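The directory layout observed in these build logs can be written down as a small sketch. The function names are hypothetical and the paths are only what appears in this thread (conda/{branch} here, envs/develop in the earlier traceback), not a documented RTD interface.

```python
import posixpath

# Root directory as it appears in the build output quoted in this thread.
USER_BUILDS_ROOT = "/home/docs/checkouts/readthedocs.org/user_builds"


def conda_env_path(repo, branch, root=USER_BUILDS_ROOT):
    """Path of the conda environment for one version of a project's docs."""
    return posixpath.join(root, repo, "conda", branch)


def virtualenv_path(repo, branch, root=USER_BUILDS_ROOT):
    """Path of the virtualenv for the same version, kept in a separate directory."""
    return posixpath.join(root, repo, "envs", branch)


print(conda_env_path("floodestimation", "develop"))
print(virtualenv_path("floodestimation", "develop"))
```

Because the final path component is the branch name, the name given in the project's own environment.yml (env in this case) does not decide where the environment lives.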
Read the Docs Conda Support
This will add the ability to generate documentation with conda environments on Read the Docs. This is mainly useful for libraries with large C dependencies, including many packages in the Scientific Python ecosystem.
Task List
Abilities
You will be able to specify a conda environment.yml file, and Read the Docs will install these dependencies in your build environment.
Considerations
Read the Docs will keep separate virtualenv & conda directories:
Users will be able to define a way to install packages for a project:
Read the Docs will need to change its build code so that we don't hard-code virtualenv paths. We'll need to vary our environment creation, as well as bin paths for executables, based on the backend environment.
The other main thing is that we'll also need to install Sphinx & other build dependencies into the conda environment. We will continue to use pip for this, and it should be transparent, other than using the pip executable in the conda environment instead of the virtualenv.
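One way to picture the "don't hard-code virtualenv paths" requirement is a pair of backends that share the bin-path logic and differ only in how the environment is created. The class and method names below are hypothetical, not Read the Docs code, and the sketch assumes a POSIX layout where executables live under bin/.

```python
import posixpath


class BuildEnvironment:
    """Hypothetical base class: an environment rooted at a directory."""

    def __init__(self, root):
        self.root = root

    def bin_path(self, executable):
        # Both virtualenv and conda environments keep executables in <root>/bin.
        return posixpath.join(self.root, "bin", executable)

    def create_command(self):
        raise NotImplementedError


class VirtualenvBackend(BuildEnvironment):
    def create_command(self):
        return ["virtualenv", self.root]


class CondaBackend(BuildEnvironment):
    def create_command(self):
        # Python is just another package in conda, so it is requested explicitly.
        return ["conda", "create", "--yes", "--prefix", self.root, "python"]


backend = CondaBackend("/tmp/user_builds/demo/conda/latest")
# Sphinx and other build dependencies would be installed with this pip.
print(backend.bin_path("pip"))
```

With this shape, the rest of the build code asks the backend for bin_path("pip") or bin_path("sphinx-build") and never needs to know which kind of environment it is running in.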
It should also be noted that miniconda has a different install process for Python 2 and 3 -- also they recommend installing it from their bash scripts instead of pip. I hope that we will be able to use pip, as it will simplify our installation, and won't require an update to a bash script on version upgrades. We will have to see if we hit issues in testing.
Cleanup
Read the Docs will manage conda environment deletion on the removal of a project or version.
Documentation
We will need to add information about conda support to our documentation. We might want to add a topic guide around installing requirements, along with adding a specific reference for how to use & enable conda support.
Sponsorship
This work is being funded by Clinical Graphics -- many thanks for their support of Open Source.