jupyterhub / mybinder.org-user-guide

Turn a Git repo into a collection of interactive notebooks. This is Binder's user documentation repository.
https://mybinder.readthedocs.io
BSD 3-Clause "New" or "Revised" License

ResolvePackageNotFound: requests=2.14 #166

Closed shankari closed 5 years ago

shankari commented 5 years ago

I am not able to deploy the environment from my repo at https://github.com/e-mission/e-mission-eval-public-data

The build fails with the error shown in this screenshot [1]:

[Screenshot: Screen Shot 2019-07-27 at 9 55 41 AM]

[1] Sorry for using a screenshot, but I can't copy/paste the build logs.

However, the same environment installs without errors on my Ubuntu host:

Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.4
  latest version: 4.7.10

Please update conda by running

    $ conda update -n base conda

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Collecting osmapi==1.2.* (from -r /home/analyst/e-mission-eval-public-data/condaenv.yv6jvp2z.requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/bd/5c/922862500070038783b62c7a81b3262fd0779a5a0ec1a06556fafa8be2a9/osmapi-1.2.2.tar.gz
Collecting polyline==1.3.* (from -r /home/analyst/e-mission-eval-public-data/condaenv.yv6jvp2z.requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/c1/d0/58a19ca3fbe880145d200518fcd97d176cae07b9677db330f4881954d5f5/polyline-1.3.2-py2.py3-none-any.whl
Requirement already satisfied: requests in /home/shankari/miniconda3/envs/emissioneval/lib/python3.6/site-packages (from osmapi==1.2.*->-r /home/analyst/e-mission-eval-public-data/condaenv.yv6jvp2z.requirements.txt (line 1))
Requirement already satisfied: six>=1.8.0 in /home/shankari/miniconda3/envs/emissioneval/lib/python3.6/site-packages (from polyline==1.3.*->-r /home/analyst/e-mission-eval-public-data/condaenv.yv6jvp2z.requirements.txt (line 2))
Building wheels for collected packages: osmapi
  Running setup.py bdist_wheel for osmapi ... done
  Stored in directory: /home/shankari/.cache/pip/wheels/a2/c1/32/07b60a6079091e5031cb33c9ddf553efbe50471945b09d0976
Successfully built osmapi
Installing collected packages: osmapi, polyline
Successfully installed osmapi-1.2.2 polyline-1.3.2
#
# To activate this environment, use
#
#     $ conda activate emissioneval
#
# To deactivate an active environment, use
#
#     $ conda deactivate

As we can verify, the resulting environment has a compatible version of requests:

(emissioneval) $ lsb_release  -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

(emissioneval) $ conda list | grep requests
requests                  2.14.2                   py36_0

choldgraf commented 5 years ago

Have you created a new conda environment from scratch using that environment.yml file on ubuntu, and it still worked? It could be that one of the packages is installing a version that's incompatible with another of the packages.

shankari commented 5 years ago

@choldgraf yup.

Let me delete the environment and restart.

There is no emissioneval environment on my Ubuntu host:

$ conda env list
# conda environments:
#
base                  *  /.../miniconda3
emission                 /.../miniconda3/envs/emission
py27                     /.../miniconda3/envs/py27

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.4 LTS
Release:        16.04
Codename:       xenial

Now I install it. I make sure to run the command manually instead of using the wrapper script `setup.sh`.

$ conda env update --name emissioneval --file environment.yml
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.5.4
  latest version: 4.7.10

Please update conda by running

    $ conda update -n base conda

Downloading and Extracting Packages
pyrsistent-0.15.4    |   89 KB | ############################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: / Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK
                                                                                      done
Collecting osmapi==1.2.* (from -r /home/analyst/e-mission-eval-public-data/condaenv.0qcv9tkl.requirements.txt (line 1))
Collecting polyline==1.3.* (from -r /home/analyst/e-mission-eval-public-data/condaenv.0qcv9tkl.requirements.txt (line 2))
  Using cached https://files.pythonhosted.org/packages/c1/d0/58a19ca3fbe880145d200518fcd97d176cae07b9677db330f4881954d5f5/polyline-1.3.2-py2.py3-none-any.whl
Requirement already satisfied: requests in /.../miniconda3/envs/emissioneval/lib/python3.6/site-packages (from osmapi==1.2.*->-r /home/analyst/e-mission-eval-public-data/condaenv.0qcv9tkl.requirements.txt (line 1))
Requirement already satisfied: six>=1.8.0 in /.../miniconda3/envs/emissioneval/lib/python3.6/site-packages (from polyline==1.3.*->-r /.../e-mission-eval-public-data/condaenv.0qcv9tkl.requirements.txt (line 2))
Installing collected packages: osmapi, polyline
Successfully installed osmapi-1.2.2 polyline-1.3.2
#
# To activate this environment, use
#
#     $ conda activate emissioneval
#

Now we have the environment

$ conda env list
# conda environments:
#
base                  *  /.../miniconda3
emission                 /.../miniconda3/envs/emission
emissioneval             /.../miniconda3/envs/emissioneval
py27                     /.../miniconda3/envs/py27

and when I activate it, the version of requests does match 2.14

$ source activate emissioneval
(emissioneval) $ conda list | grep requests
requests                  2.14.2                   py36_0

betatim commented 5 years ago

Can you try updating your version of conda to 4.7? This is what we use in repo2docker. I think conda 4.7 got faster as well as stricter on resolving dependencies.

shankari commented 5 years ago

I can reproduce the error after updating the version.

Is the change to 4.7 recent? Because this appears to be a regression - everything did work ~ 1-2 months ago.

If this is indeed a recent change, it would be helpful to have a way to monitor changes/proposed changes to the environment so that we don't end up having our binder bitrot without any notice.

Maybe formal releases that we can subscribe to would be a good idea?

choldgraf commented 5 years ago

Does the same error happen if you explicitly pin all of your dependencies w/ no wild-cards? We've found that having some dependency versions pinned while others are not pinned often results in weird things happening.
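
One rough way to check for partially pinned dependencies is to list the entries in `environment.yml` that have no version or that use a wildcard. This is only a sketch: `find_loose_pins` is a hypothetical helper name, and it assumes a conventionally formatted file with one dependency per `- ` line under a top-level `dependencies:` key.

```shell
# Hypothetical helper: print dependency lines that are unpinned or use a wildcard.
# Assumes one dependency per "- " line under a top-level "dependencies:" key.
find_loose_pins() {
    awk '
        /^dependencies:/ { in_deps = 1; next }   # start of the dependency list
        /^[^ \t#-]/      { in_deps = 0 }         # any other top-level key ends it
        # skip the "- pip:" sub-list header; flag entries with "*" or without "="
        in_deps && /^[ \t]*-/ && !/pip:[ \t]*$/ && (/\*/ || !/=/) { print }
    ' "$1"
}
```

Running `find_loose_pins environment.yml` prints nothing when every entry carries an explicit version with no wildcard.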

(also just a note from your readme: the reason that launches are often slow is that the first time your environment is launched on a node, a Docker pull has to happen, which can take a while. After that it is faster if you land on the same node again. We're hoping to find ways to surface this information better)

shankari commented 5 years ago

@choldgraf yup. I originally had all the versions explicitly pinned down to the patch level and everything was working. Then, when I tested the binder configuration again recently, it gave this error with resolving the requests package.

So I added wildcards (https://github.com/e-mission/e-mission-eval-public-data/commit/d1542c81d0c886d84744d6d3734c27eae5d2420b) just ~ 5 days ago.

For now, I plan to work around this by just bumping requests up to 2.22.*, which is the version that I get with conda 4.7.

But I'm a little concerned that everything could break again if/when you upgrade to a new conda release. And then people trying to use the repo will give up because it is broken.

I've been checking in notebooks without embedded results, reasoning that people can use binder to get the results. Not sure if I need to rethink that.

shankari commented 5 years ago

> (also just a note from your readme: the reason that launches are often slow is that the first time your environment is launched on a node, a Docker pull has to happen, which can take a while. After that it is faster if you land on the same node again. We're hoping to find ways to surface this information better)

Presumably you are also garbage collecting the environments after a while, so I would expect this to happen every time for infrequently used repos. Which is totally fine and expected, but since my work is inter-disciplinary, I just wanted to reassure my users that everything was still working.

choldgraf commented 5 years ago

re: the conda versions etc, please keep an eye on that problem and if it comes up often enough we might need to design around it in repo2docker (e.g., specify a conda version? I dunno...)

re: the image pulls, there is some garbage collecting that happens, though not a ton. Hopefully we'll be able to provide more reassurance to folks through better logging and messaging

betatim commented 5 years ago

repo2docker (the tool used to build images from repositories) is released roughly every three months, but the version on mybinder.org changes every few days or weeks (we run master).

The best option is to run `repo2docker .` as part of your CI or, if that doesn't run very often, as a cron job on Travis. That way you will know whether the repo would still build today or not.

My hunch is that conda tends to get better (and in this release stricter) over time. Without looking into your particular set of packages and versions this means that it was "lucky" that things resolved with an earlier version.

The image cache we have on mybinder.org is just that: a cache. We will (and do) remove stuff from it, as it quickly grows to tens of terabytes if left unchecked. This is inconvenient but the best compromise given we don't have an infinite amount of money :)

There are proposals, discussions and proofs of concept for letting repositories pin the version of repo2docker that they want to be built with. However, no one has yet invested the effort to push it over the line. As always, contributions are very welcome (and needed).

My summary after watching many repositories for two or so years now: keeping your repo running requires constant effort. We can do several things to mitigate how quickly it bit rots but the only thing that would remove the need for constant maintenance is finding a way to stop the universe from changing. This is why I recommend running repo2docker master once a week in your CI to get notified early when things break, because they will.


We have a few repositories in our CI that we build with repo2docker. I think the longest time between having to make changes to a repo because it started breaking "overnight" is something like twelve months. The most recent one was a repo installing pinned versions of python 2, numpy and matplotlib. Nothing else. It broke because there was a new release of numpy. How? Because the old pinned version of numpy had a "broken" build setup. The point being: even super simple repos break for no good reason :-(
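
As a minimal sketch of the CI check recommended above, a build step might look like the following. `check_binder_build` and the `REPO2DOCKER` override are hypothetical names chosen for illustration. The sketch assumes repo2docker is installed (for example from the `jupyter-repo2docker` package) and that Docker is available on the CI machine. Note that a released package may lag behind the master version that mybinder.org runs.

```shell
# Hypothetical CI helper: fail the job if the repository no longer builds.
# Assumes repo2docker is on PATH and Docker is running on the CI machine.
check_binder_build() {
    repo_dir="${1:-.}"
    # --no-run builds the image without starting a notebook server afterwards.
    # REPO2DOCKER can be overridden (e.g. for a dry run); it defaults to repo2docker.
    "${REPO2DOCKER:-repo2docker}" --no-run "$repo_dir"
}
```

A weekly scheduled job could call `check_binder_build .` from the repository root; a non-zero exit status from the build then fails the job.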

shankari commented 5 years ago

@betatim thanks for the pointer to repo2docker. If/when I set up CI for this project, I will definitely look into it.

> The image cache we have on mybinder.org is just that: a cache. We will (and do) remove stuff from it, as it quickly grows to tens of terabytes if left unchecked. This is inconvenient but the best compromise given we don't have an infinite amount of money :)

Just in case I was not clear earlier, I think this makes perfect sense. You guys are offering a great service to the community and I am grateful to you for all your work. I was just trying to reduce the number of "the project is not working" issues filed against my project by giving people an idea of what to expect.

Again, thanks so much for all you do!

choldgraf commented 5 years ago

thanks @shankari for being a helpful issue-filer :-)

betatim commented 5 years ago

No worries. I find it helpful to explain why we do what we do because some of it can seem quite draconian or arbitrary without the reasoning. Also, when new people think about an argument they find new solutions (within the existing constraints) 😀

shankari commented 5 years ago

@betatim I just ran into a couple of other issues with checked-in notebooks (scipy dependency, a line I forgot to comment out...), and it looks like others may want to contribute too, for at least some time. So I would like to turn on CI.

I found this script for running all the notebooks in a particular directory, so I could just have travis install the conda environment and run the notebooks. However, I would also like to combine that with repo2docker to catch regressions like the one in this bug.

Is there a concrete example of using repo2docker in a CI build that I can use as a template? If not, I am happy to contribute one, but want to avoid unnecessary work 😎

choldgraf commented 5 years ago

Hmmm - this is the only example that I know of that covers continuous integration etc.: https://github.com/binder-examples/continuous-build. Maybe that is a start in the right direction?

betatim commented 5 years ago

I have used `repo2docker . papermill somenotebook.ipynb` to run a single notebook in a CI command. To run several, I'd add a script called `verify` to the repo that runs the notebooks (or does whatever needs doing to verify the image works) and then use `repo2docker . ./verify` to execute it.

I think the continuous-build example repo is wwwaaayyy too complex :-/
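
A `verify` script along the lines described above might be sketched as follows. The function names and the default output directory are hypothetical. The sketch assumes `jupyter nbconvert` is available inside the built image and uses it in place of papermill to execute each notebook.

```shell
# Hypothetical body of a "verify" script: execute every notebook in a directory.
# Assumes jupyter nbconvert is installed in the image built by repo2docker.
execute_notebook() {
    # $1 = notebook path, $2 = directory for the executed copy
    jupyter nbconvert --to notebook --execute --output-dir "$2" "$1"
}

run_notebooks() {
    nb_dir="${1:-.}"
    out_dir="${2:-/tmp/executed-notebooks}"
    failed=0
    mkdir -p "$out_dir"
    for nb in "$nb_dir"/*.ipynb; do
        [ -e "$nb" ] || continue          # the glob matched nothing
        echo "Executing $nb"
        execute_notebook "$nb" "$out_dir" || failed=$((failed + 1))
    done
    echo "$failed notebook(s) failed"
    [ "$failed" -eq 0 ]                   # non-zero status if any notebook failed
}
```

Ending the script with a call such as `run_notebooks .` and invoking it as `repo2docker . ./verify` would make the CI job fail when any notebook raises an error.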

choldgraf commented 5 years ago

@shankari I think this is all a way of saying: yes, if you wanna show an example of using repo2docker in order to verify build environments as part of CI, I think it'd be awesome :-)

betatim commented 5 years ago

Yes please! to what Chris said :)

Maybe we can make a new example repository that demos whatever you come up with, @shankari? That would be awesome.