Closed mrocklin closed 8 years ago
I think it would be a good idea to have an RTD conda organization, which hosts all the packages for all supported platforms. That way you know it should always work for everyone with minimal effort.
To debug the configuration, try calling conda info -a, which shows how everything is set up. Running conda list -n develop will show you what is currently present in your environment. This would make it easier to debug whether everything is present or not.
But I think the culprit is that you forgot to call source activate develop
before calling sphinx-build. I reckon that since it's in your environment's bin folder, you should be able to simply call:
source activate develop
sphinx-build -T -E -b readthedocs -d _build/doctrees-readthedocs -D language=en . _build/html
and have it build the docs.
Not sure who you mean with "you". I don't need to/cannot activate an environment, RTD should take care of that. You're probably right though that the command:
python /home/docs/checkouts/readthedocs.org/user_builds/floodestimation/conda/develop/bin/sphinx-build ...
calls the wrong Python. By the way, should they not drop python
from the command altogether because sphinx-build
is the executable itself? I'm not too familiar with Python entry points on Linux... If you call the entry point with the full path you probably don't need to activate the environment at all.
@faph Ah -- I hit this issue in dev. Since you're specifying the Python version in your environment.yml but not in the RTD YAML config, we're installing "python=2" with conda -- then installing Sphinx with pip, and then re-installing Python 3.4 from your environment.yml -- this means that the final Python environment doesn't have the dependencies that we installed in it.
I'm not sure of the best way to work around this, other than putting this in your YAML config:
conda:
    file: environment.yml
python:
    version: 3.4
Which will install the proper version of Python the first time, I believe. I also hit the issue with the project not being installed properly. This file got the build working:
conda:
    file: environment.yml
python:
    version: 3.4
    setup_py_install: true
Perfect! Many thanks - this works. Very pleased to be able to drop my numpy
and scipy
mocking!
I like the RTD yaml config file by the way.
I'm not so sure about the pre-installed packages though, whether that's the cleanest. Would it be worthwhile adding an option to the yaml config file to allow a "bare" environment, purely from the specified environment.yml
file?
Awesome -- We're excited about the YAML file as well :)
I think it makes sense to be able to run with a clean environment and just support the environment defined in the file. From what I can tell, you can't create a conda environment from an environment file with conda create. Perhaps we could do conda env create -n <version> -f environment.yml if you enable the bare option, instead of doing a conda create.
To be honest, I find all the CLI interfaces to conda a bit confusing :)
The only annoying thing with conda is the difference between plain conda
and the conda env
sub-command. Apart from that, it's pretty robust.
conda env create -n <version> -f <config.conda.file>
would be awesome.
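A minimal sketch of how the proposed "bare" option might choose between the two commands. The function name conda_create_command and the bare flag are hypothetical, made up for illustration; this is not RTD's code. Only the conda create and conda env create -n ... -f ... invocations themselves come from the discussion above.

```python
def conda_create_command(version, bare, env_file="environment.yml"):
    """Return the argv list that would create the build environment.

    `bare` is a hypothetical flag standing in for the option proposed
    in this thread; it is not an existing RTD setting.
    """
    if bare:
        # Build the environment purely from the user's environment file.
        return ["conda", "env", "create", "-n", version, "-f", env_file]
    # Otherwise pre-create an environment that holds only Python;
    # further packages would be added in later steps.
    return ["conda", "create", "--yes", "-n", version, "python"]


print(conda_create_command("latest", bare=True))
```

Returning an argv list rather than a shell string keeps the branch name (which can be anything) from being interpreted by a shell.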
Sorry if I'm not following (getting over a cold), can one not specify the python version in environment.yml?
Sure you can...
name: myenv
dependencies:
- python ==3.5
@jakirkham Yes, but currently RTD pre-creates a conda environment with just python and sphinx etc. To make sure this pre-created environment has the correct version you MUST specify the python version (also) in readthedocs.yml.
@faph, wouldn't specifying the Python version in environment.yml
make sure it is properly upgraded or is there some other magic going on behind the scenes?
@jakirkham that's what I thought. But for some reason that's not working. Not sure if that's because of RTD first installing some sphinx dependencies with pip into the environment or whether the upgrade from Python 2 to 3 is not safe. Not sure. My wish list includes having the option to fully create the exact environment in one go from your own environment.yml
without anything pre-installed by RTD.
We just hooked this up for xray, and it worked pretty much without a hitch!
On my first attempt, it still did the build with pip instead of conda. But after wiping the environment on that branch and resetting my advanced settings, the next time I did a build it used conda. I'm not quite sure what was going on there...
I had that a couple of times too. I did the same with wiping the RTD build. Does RTD cache the config or something?
I'm having trouble with a project that needs to know where a conda-installed .so file lives. I have two conda packages, a compiled C++ library, libhdfs3.so, and a Python wrapper library, hdfs3. The Python wrapper library tries to find the location of the .so file by finding the main conda directory and then adding /lib. It finds the main conda directory by shelling out to conda info and then finding the text after default environment:
(py35)mrocklin@notebook:~$ conda info
...
default environment : /home/mrocklin/Software/anaconda/envs/py35
...
path = os.path.join(conda_dir, 'lib', 'libhdfs3.so')
Oddly, when I do this in the RTD environment the resulting path turns out to be /usr/lib/libhdfs3.so, which makes me very confused. Where is the conda directory here? What is the result of calling conda info on an RTD machine? Are we using the system Python or the conda Python during builds?
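For readers following along, here is a minimal sketch of the lookup described above. It parses text rather than calling conda, so it runs anywhere. The helper name conda_dir_from_info is hypothetical, and the "default environment" label is taken from the excerpt quoted above; it may differ in other conda versions.

```python
import os
import re

# Matches a line such as:
#   default environment : /home/mrocklin/Software/anaconda/envs/py35
_DEFAULT_ENV = re.compile(
    r"^[ \t]*default environment[ \t]*:[ \t]*(.+?)[ \t]*$", re.MULTILINE
)


def conda_dir_from_info(info_text):
    """Extract the 'default environment' path from `conda info` output."""
    match = _DEFAULT_ENV.search(info_text)
    if match is None:
        raise ValueError("no 'default environment' line found")
    return match.group(1)


# Sample text mirroring the excerpt quoted above.
sample = (
    "...\n"
    "default environment : /home/mrocklin/Software/anaconda/envs/py35\n"
    "...\n"
)
conda_dir = conda_dir_from_info(sample)
path = os.path.join(conda_dir, 'lib', 'libhdfs3.so')
print(path)
```

In real use the text would come from shelling out to conda info, so the result depends on which environment conda considers the default at that moment.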
Hi @mrocklin, I know very little about the RTD build process so I might be saying nonsense but I don't see the activate
step in the failed build you linked. All the packages are getting installed in a conda environment called latest
but, as far as I can see, it's never activated. This might be the cause of the failure, because in that case conda info | grep 'default environment'
will return the root environment. Perhaps calling /home/docs/checkouts/readthedocs.org/user_builds/hdfs3/conda/latest/bin/python
would be safer in this case? My two cents.
By the way, instead of reading conda info
perhaps you could use the sys.prefix
variable, which should point to the right locations when activating the environment. ctypes.find_library
unfortunately won't work, see https://github.com/ioos/conda-recipes/issues/184#issuecomment-96245725 and https://github.com/ocefpaf/conda-recipes/blob/8f8c28e79a79a06ebfb98b4a3c099e92965cd595/rtree/find_libray.patch#L51
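A minimal sketch of the sys.prefix suggestion, assuming the library sits in the lib directory under the environment prefix as described earlier in the thread. The helper name shared_library_path is hypothetical. Since sys.prefix reflects the interpreter that is actually running, this should not depend on parsing conda info.

```python
import os
import sys


def shared_library_path(name="libhdfs3.so", prefix=None):
    """Return the expected location of a shared library under an environment prefix.

    By default the prefix of the running interpreter (sys.prefix) is used.
    """
    prefix = sys.prefix if prefix is None else prefix
    return os.path.join(prefix, "lib", name)


print(shared_library_path())
```

The function only builds the path; checking that the file exists (for example with os.path.exists) is left to the caller.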
I think the conda environment's python.exe
is called directly, which is indeed slightly incorrect since any other steps involved in the activate
script are skipped (which can vary per environment).
For reference, the conda environment is named after the branch or document version and could be anything like /home/docs/checkouts/readthedocs.org/user_builds/{reponame}/conda/{latest, develop, stable, ...}
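The layout described above can be sketched as a small path builder. The function name conda_env_bin is hypothetical; the directory pattern is the one quoted in this thread, not something read from RTD's source.

```python
import posixpath

# Base directory as it appears in the build logs quoted in this thread.
USER_BUILDS = "/home/docs/checkouts/readthedocs.org/user_builds"


def conda_env_bin(reponame, version, executable="python"):
    """Return the full path of an executable inside a build's conda environment."""
    return posixpath.join(USER_BUILDS, reponame, "conda", version, "bin", executable)


print(conda_env_bin("hdfs3", "latest"))
print(conda_env_bin("floodestimation", "develop", "sphinx-build"))
```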
Yeah, if an environment isn't activated, it is probably using the root python. Though, and I could be mistaken, there is no reason not to just install everything into the root environment here.
I think conda env create
is incapable of that, actually.
I believe you are correct @Korijn. However, I think you could use conda env update, which shouldn't have an issue.
Though, and I could be mistaken, there is no reason not to just install everything into the root environment here. (@jakirkham)
One reason is for example Python 2 versus Python 3, i.e. you can have a standard/hard-coded root environment (managed by RTD, e.g. Python 2 + just conda) and let the user install exactly the documentation build packages in the environment you like (e.g. Python 3, sphinx, ...). So the root environment never gets touched.
I believe the approach Continuum wants to take is to isolate conda itself as much as possible into an environment that ordinary users do not touch, i.e. consider it a standalone application. This way you can guarantee that conda works without interfering/being interfered by user packages.
I believe the approach Continuum wants to take is to isolate conda itself as much as possible into an environment that ordinary users do not touch, i.e. consider it a standalone application.
That was the approach I was assuming was happening here. It tends to be the assumed default among conda users. The way that I would expect this to work is that the user supplies a conda environment for their build, this gets activated, and then RTD pushes the packages it needs on top of this environment.
I think that is roughly happening at RTD, but the other way around. RTD first creates a separate environment and installs sphinx into it. Then it updates that environment with the user's supplied environment.yml.
Although there is no trace of conda's activate command in the build log, I assume the correct Python is on the PATH because docs builds do work (RTD calls python /full/path/to/env/bin/sphinx-build). Presumably conda's activate also puts the bin dir on the PATH; if that were to be used, RTD could call sphinx-build directly?
Seems like there's a lot of back and forth here, but no real consensus on what a proper solution is. Can anyone outline what the best course of action is for properly supporting these use cases? What we have now seems to work in the normal case, but there are edge cases where it isn't working, is my reading?
Here is my real question. Is there a reason for installing into the root environment or is this just happening? If the former, what is the reason? If the latter, it is best to change the current behavior to create a clean new non-default environment as @mrocklin has said.
I'm pretty sure RTD is not installing into the root conda environment. However, it doesn't activate a new environment either. Instead, it simply uses the binaries in the new build specific environment that it created directly. In my mind, this is a pretty reasonable solution -- activating an environment really only makes sense in an interactive session or shell script.
Hrm, I think of conda environments as being particularly useful in these cases, where you want a reliable software environment for build purposes. Most build services within Continuum happen within a conda environment for predictability's sake.
@ericholscher, is there code somewhere we can look at for this? I think it would answer a lot of questions. Sorry if I missed the link somewhere.
As I said, it is being done in a fresh conda environment, the environment just isn't being activated -- the binaries in the environment are being called directly.
Are there particular issues you foresee occurring if you activate the environment?
In this case the issue is that my library actively depends on the conda state in order to locate a shared object file. Arguably this is a less-than-ideal way to find a shared library, but it's not the ugliest thing that people are going to try with rtd+conda.
FWIW, I've also worked around this within my own library (I now allow the library to import if it can't find the .so file) so my immediate use case is gone. Still though, I think that fewer future corner cases will occur among conda users if the environment actually gets activated. I think that this is the common case (though I have less experience here than many.)
Edit: fewer future corner cases will occur among conda users if the environment actually gets activated.
Is this being run in a docker image? I can certainly propose strategies that would ensure the environment is activated. If there is code I can PR against, I can even provide a solution, but it is hard to do in the dark.
@jakirkham here is the PR that added conda support to RTD: https://github.com/rtfd/readthedocs.org/pull/1849
Cool, thanks @shoyer.
As I said, it is being done in a fresh conda environment, the environment just isn't being activated -- the binaries in the environment are being called directly.
We have the same issue in Jupyter: if you don't activate the environment, code that relies on command line utilities being on the PATH will not work properly, leading to weird behavior. So calling /full/path/to/python directly is problematic. And "shelling out needs to use sys.executable" is not a satisfying answer, as code might require non-Python deps that conda can install, like pandoc.
So, I think running the activate script is probably too much to hope for (maybe I'm wrong). However, I have found this normally covers it on Linux (let me know if I am missing something). If I run conda info or pretty much any other conda command this works the same as activate. It can easily be added to whatever language we want. These are all environment variables.
- $CONDA_DEFAULT_ENV set to the name of the environment we want.
- $CONDA_ENV_PATH set to the path of the environment, which should be <conda_installation>/envs/$CONDA_DEFAULT_ENV.
- $CONDA_ENV_PATH/bin first in $PATH.
@ericholscher I think all that is asked for is that the environment created gets added to PATH, including the standard sub-dirs like bin, lib etc. In conda-land there is an activate script for that. To satisfy the crowds you may want to document the exact install steps including env variables. Just because now you've enabled conda, people are going to install the most exotic packages including non-Python stuff!
Honestly, I have never needed to add lib or anything else other than bin.
$CONDA_DEFAULT_ENV
and $CONDA_ENV_PATH
seem to be important. The first shouldn't be surprising (need the environment name somehow). I am not sure why the second is used (maybe if you have multiple conda installs). The rest of activate
seems more about nice things for a user to have like $PS1
and such that really don't seem important here, but I could be wrong.
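A minimal sketch of reproducing those three settings without sourcing the activate script. The function name activated_environ is hypothetical; the variable names ($CONDA_DEFAULT_ENV, $CONDA_ENV_PATH, $PATH) are the ones listed above, and whether they are sufficient for every package is an open question in this thread.

```python
import os


def activated_environ(env_path, env_name, base_environ=None):
    """Return a copy of the environment with the conda variables set.

    This mimics only the parts of `activate` discussed in this thread;
    it does not run the activate script itself.
    """
    environ = dict(os.environ if base_environ is None else base_environ)
    environ["CONDA_DEFAULT_ENV"] = env_name
    environ["CONDA_ENV_PATH"] = env_path
    # Put the environment's bin directory first on PATH.
    bin_dir = os.path.join(env_path, "bin")
    old_path = environ.get("PATH", "")
    environ["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return environ


env = activated_environ("/opt/conda/envs/latest", "latest", {"PATH": "/usr/bin"})
print(env["PATH"])
```

The base environment is copied rather than modified, so the calling process keeps its own PATH.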
I forked the project to experiment, and have several thoughts:
-q option to conda commands to remove progress bars (originally suggested to me at https://github.com/Juanlu001/fenics-recipes/issues/20#issuecomment-169640776)
This is the code I was about to try by the way:
https://github.com/Juanlu001/hdfs3/commit/b004f0d53c76ee0aaf0521e9fcf0b9eb5bd71cd1
Answering @mrocklin's question about activate, I've been exclusively using conda for more than a year and I don't foresee any particular issues arising. I think the best option is probably to just use it.
We aren't running conda in an interactive session, so I'm not sure how we could run activate. If there are environment variables that should be supported, we could set them, but we're running these commands independently, so the variables set in the activate script wouldn't apply.
The code is here: https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/python_environments.py#L163
We aren't running conda in an interactive session, so I'm not sure how we could "call" activate.
Yeah, I wouldn't be surprised if this is a problem. The activate
script is a long-ish bash script that needs to be sourced. Doesn't seem like that would mesh well with what you are doing. Fortunately, it doesn't seem to matter.
If there are environment variables that should be supported, we could set them, but we're running these commands independently, so the variables set in the activate script wouldn't apply.
Right, so if we can just tack them onto whatever environment that is used when shelling out, I think this would be fine.
The code is here: https://github.com/rtfd/readthedocs.org/blob/master/readthedocs/doc_builder/python_environments.py#L163
Thanks for the link. I have been looking at the code. I had a question that I put on the merged PR.
For the record, here is a diff of environment variables when I run activate
here:
+PYTHONNOUSERSITE=1
-PATH=/home/antoine/.local/bin:/usr/local/cuda/bin:/home/antoine/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
+PATH=/home/antoine/35/bin:/home/antoine/.local/bin:/usr/local/cuda/bin:/home/antoine/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
+CONDA_ENV_PATH=/home/antoine/35
+CONDA_DEFAULT_ENV=/home/antoine/35
As you see there's not much to it. The most important is probably the prepending of the conda environment's bin directory to the PATH.
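A diff like the one above can be produced by comparing two snapshots of the environment. This is a small sketch with a hypothetical helper name, diff_environ; the sample values are shortened from the diff above.

```python
def diff_environ(before, after):
    """Return diff-style lines for variables that differ between two mappings."""
    lines = []
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old == new:
            continue
        if old is not None:
            lines.append("-{}={}".format(key, old))
        if new is not None:
            lines.append("+{}={}".format(key, new))
    return lines


before = {"PATH": "/usr/bin"}
after = {
    "PATH": "/home/antoine/35/bin:/usr/bin",
    "CONDA_ENV_PATH": "/home/antoine/35",
    "CONDA_DEFAULT_ENV": "/home/antoine/35",
}
for line in diff_environ(before, after):
    print(line)
```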
Interesting, I don't get this PYTHONNOUSERSITE
. Seems like a useful add in general. Not sure if it matters here. Which version of conda
are you on?
Oh, forget it. PYTHONNOUSERSITE is from my own activate
wrapper :-)
Ah, ok. In any event, I think we can safely ignore it here as there should be only this one Python install that we are worried about.
Should be easy enough to support the conda PATH, as we already have the bin_path
argument to our run call.
I could be wrong, but I think we want to be able to pass environment variables as a kwarg to something like this ( https://github.com/rtfd/readthedocs.org/blob/aba714e82d218d60773955aec62a3df74173348d/readthedocs/doc_builder/backends/sphinx.py#L156 ). Does run
permit that?
So, we have to change the environment of a BuildCommand
then, yes? Maybe we can just add these to the environment before the build is called?
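A minimal sketch of passing extra environment variables to a child process, using only the standard library. The helper run_with_env is hypothetical and is not RTD's run call or BuildCommand; whether those accept an environment argument is exactly the question asked above.

```python
import os
import subprocess
import sys


def run_with_env(argv, extra_env):
    """Run a command with extra variables layered over the current environment.

    Returns the command's standard output as text.
    """
    environ = dict(os.environ)
    environ.update(extra_env)
    result = subprocess.run(
        argv,
        env=environ,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


# The child process sees the variable even though the parent never exported it.
out = run_with_env(
    [sys.executable, "-c", "import os; print(os.environ['CONDA_DEFAULT_ENV'])"],
    {"CONDA_DEFAULT_ENV": "latest"},
)
print(out.strip())
```

Because each command gets its own environment mapping, this fits a setup where commands are run independently and nothing persists between them.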
Read the Docs Conda Support
This will add the ability to generate documentation with conda environments on Read the Docs. This is mainly useful for libraries with large C dependencies, including many packages in the Scientific Python ecosystem.
Task List
Abilities
You will be able to specify a conda environment.yml file, and Read the Docs will install these dependencies in your build environment.
Considerations
Read the Docs will keep separate virtualenv & conda directories:
Users will be able to define a way to install packages for a project:
Read the Docs will need to change its build code so that we don't hard-code virtualenv paths. We'll need to vary our environment creation, as well as bin paths for executables, based on the backend environment.
The other main thing is that we'll also need to install Sphinx & other build dependencies into the conda environment. We will continue to use pip for this, and it should be transparent, other than using the pip executable in the conda environment instead of the virtualenv.
It should also be noted that miniconda has a different install process for Python 2 and 3 -- also they recommend installing it from their bash scripts instead of pip. I hope that we will be able to use pip, as it will simplify our installation, and won't require an update to a bash script on version upgrades. We will have to see if we hit issues in testing.
Cleanup
Read the Docs will manage conda environment deletion on the removal of a project or version.
Documentation
We will need to add information about conda support to our documentation. We might want to add a topic guide around installing requirements, along with adding a specific reference for how to use & enable conda support.
Sponsorship
This work is being funded by Clinical Graphics -- many thanks for their support of Open Source.