NSLS-II / scipy-binder

Binder-compatible repo including scipy stack and databroker

Address bit rot #46

Closed danielballan closed 2 years ago

danielballan commented 2 years ago

[Edited]

Summary of the below:

danielballan commented 2 years ago
  Step 41/54 : RUN TIMEFORMAT='time: %3R' bash -c 'time ${MAMBA_EXE} env update -p ${NB_PYTHON_PREFIX} --file "binder/environment.yml" && time ${MAMBA_EXE} clean --all -f -y && ${MAMBA_EXE} list -p ${NB_PYTHON_PREFIX} '
   ---> Running in 2b3cc43312db
  bash: line 1:     8 Killed                  ${MAMBA_EXE} env update -p ${NB_PYTHON_PREFIX} --file "binder/environment.yml"
  time: 4132.264
  Removing intermediate container 2b3cc43312db
  The command '/bin/sh -c TIMEFORMAT='time: %3R' bash -c 'time ${MAMBA_EXE} env update -p ${NB_PYTHON_PREFIX} --file "binder/environment.yml" && time ${MAMBA_EXE} clean --all -f -y && ${MAMBA_EXE} list -p ${NB_PYTHON_PREFIX} '' returned a non-zero code: 137
danielballan commented 2 years ago

I confirmed locally with

jupyter-repo2docker .

that the same step hangs for at least one hour:

Step 41/51 : RUN TIMEFORMAT='time: %3R' bash -c 'time mamba env update -p ${NB_PYTHON_PREFIX} -f "binder/environment.yml" && time mamba clean --all -f -y && mamba list -p ${NB_PYTHON_PREFIX} '
 ---> Running in 2da183238069

I suspect GH Actions is killing it for lack of any output. It may be killing it for memory usage (OOM) but it's hard to imagine how mamba could use up all 7 GB of RAM provided to GH Action runners.
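
As a side note on the `non-zero code: 137` in the failed step: by shell convention, an exit status above 128 means the process was terminated by signal number (status - 128). The sketch below decodes that; the helper name `describe_exit_status` is made up for illustration and is not part of repo2docker or mamba.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper).

    Shells report 128 + N when a command is terminated by signal N.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited normally with status {status}"


if __name__ == "__main__":
    print(describe_exit_status(137))
```

On Linux this reports SIGKILL (signal 9). That matches the `Killed` line in the log, but it does not by itself distinguish an out-of-memory kill from a runner terminating the job for another reason, since both can deliver SIGKILL.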

danielballan commented 2 years ago

The above was running on a very old version of jupyter-repo2docker, one that pre-dates their switch from conda to micromamba. But now with the latest jupyter-repo2docker (2022.02.0) I still get stuck in the same place. I let it sit for over an hour before killing it:

Step 41/54 : RUN TIMEFORMAT='time: %3R' bash -c 'time ${MAMBA_EXE} env update -p ${NB_PYTHON_PREFIX} --file "binder/environment.yml" && time ${MAMBA_EXE} clean --all -f -y && ${MAMBA_EXE} list -p ${NB_PYTHON_PREFIX} '
 ---> Running in cff2fa5d2444
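
Rather than watching a stuck build by hand, one option is to wrap the command in a timeout. This is only a sketch: `run_with_timeout` is a hypothetical helper, and the demo uses a sleeping Python process as a stand-in for the build so that it runs anywhere.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return None


if __name__ == "__main__":
    # Stand-in for a stuck build; a real check would pass something like
    # ["jupyter-repo2docker", "."] and a much larger timeout.
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    result = run_with_timeout(stuck, timeout_s=1)
    print("timed out" if result is None else f"exit code {result}")
```

The timeout only terminates the process that was launched directly; whether work started on its behalf elsewhere also stops is not something this sketch addresses.
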
danielballan commented 2 years ago

Creating the same environment locally, without repo2docker:

$ time micromamba create -f binder/environment.yml  -p from_file --dry-run

finishes in about a minute. I suspect the problem is the two-step creation of the environment: it is first created with the base requirements for JupyterHub and then updated with environment.yml.

I can extract a Dockerfile like this:

$ jupyter-repo2docker --debug --no-build . > Dockerfile

but it expects a certain execution environment (e.g. supporting files) from repo2docker, so I conclude that path would be fiddly and error-prone.

danielballan commented 2 years ago

Next idea: trying to remove some of the pins from Jupyter-related projects to see if those are creating bad interactions with the base/default environment.

danielballan commented 2 years ago

Update: Nope.

danielballan commented 2 years ago

My current approach is to run

jupyter-repo2docker .

in the repository root, locally, and try commenting out various lines of environment.yml to see which lines (or combinations of lines) create the hang.
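
The commenting-out procedure can be organized as a binary search. The sketch below assumes exactly one line is responsible and that it triggers the hang on its own, so it does not cover the "combinations of lines" case. `find_culprit` and the `hangs` callback are hypothetical; in practice `hangs` would write the subset to environment.yml and run a time-limited build. Apart from `python=3.9` and `databroker`, the package names in the demo are placeholders, not the repository's actual dependency list.

```python
from typing import Callable, List, Sequence


def find_culprit(lines: Sequence[str], hangs: Callable[[List[str]], bool]) -> str:
    """Binary-search for the single line whose presence makes `hangs` True.

    Assumes exactly one culprit that causes the hang by itself.
    """
    candidates = list(lines)
    if not candidates:
        raise ValueError("no dependency lines to search")
    while len(candidates) > 1:
        first_half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the hang.
        if hangs(first_half):
            candidates = first_half
        else:
            candidates = candidates[len(first_half):]
    return candidates[0]


if __name__ == "__main__":
    # Placeholder dependency list and a fake predicate standing in for
    # "the build with these lines did not finish in time".
    deps = ["numpy", "scipy", "python=3.9", "databroker", "matplotlib"]
    print(find_culprit(deps, lambda subset: "python=3.9" in subset))
```

With n dependency lines this needs about log2(n) builds instead of n, which matters when each failed attempt costs an hour.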

danielballan commented 2 years ago

OK, locally https://github.com/NSLS-II/scipy-binder/pull/46/commits/aa3d624053f9692809def31f1851203203f08cfb works.

The only change is that I have omitted the python=3.9 pin, so we are getting repo2docker's default of Python 3.7.

It seems that the Python 3.9 pin is incompatible with something. The next step should be to restore that pin and go back through the process of checking chunks of the requirements to narrow down where the issue is.

danielballan commented 2 years ago

A snapshot:

aa3d624 worked in one of the two builds, but it took a very long time.


danielballan commented 2 years ago
Encountered problems while solving:
  - package edrixs-0.0.4-py36hc8ef283_0 requires python >=3.6,<3.7.0a0, but none of the providers can be installed
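
To illustrate why the solver rejects this build: the constraint `python >=3.6,<3.7.0a0` admits only the 3.6 series, so neither the Python 3.7 default nor a 3.9 pin can satisfy it. The sketch below simplifies the upper bound to "below 3.7" and handles only plain numeric versions; both function names are made up, and this is not how mamba evaluates version specs.

```python
def parse_version(text: str) -> tuple:
    """Turn 'X.Y' or 'X.Y.Z' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def satisfies_py36_build(python_version: str) -> bool:
    """Check the constraint >=3.6,<3.7.0a0, simplified to 3.6 <= version < 3.7.

    The 'a0' suffix on the upper bound excludes 3.7 pre-releases as well, so
    for final release numbers it behaves like '<3.7'.
    """
    major_minor = parse_version(python_version)[:2]
    return (3, 6) <= major_minor < (3, 7)


if __name__ == "__main__":
    for candidate in ("3.6.15", "3.7", "3.9"):
        print(candidate, satisfies_py36_build(candidate))
```
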
danielballan commented 2 years ago

For 7241050, the Mamba-reported time (includes solve + download + install) is time: 435.096.

danielballan commented 2 years ago

For 06594dd, the Mamba-reported time is time: 666.575.
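
The `time:` figures come from bash's `TIMEFORMAT='time: %3R'` in the build step, which prints elapsed wall-clock seconds with three decimal places. A small sketch for pulling those values out of a build log and converting them to minutes follows; `elapsed_seconds` is a hypothetical helper and the sample log is assembled from lines quoted in this thread.

```python
import re

# Matches lines such as "time: 435.096", allowing leading indentation.
TIME_LINE = re.compile(r"^[ \t]*time:[ \t]*(\d+(?:\.\d+)?)[ \t]*$", re.MULTILINE)


def elapsed_seconds(log_text: str) -> list:
    """Return every 'time: <seconds>' value found in the log, in order."""
    return [float(value) for value in TIME_LINE.findall(log_text)]


if __name__ == "__main__":
    sample_log = "\n".join(
        [
            " ---> Running in 2b3cc43312db",
            "time: 4132.264",
            "time: 435.096",
            "time: 666.575",
        ]
    )
    for seconds in elapsed_seconds(sample_log):
        print(f"{seconds:.3f} s = {seconds / 60:.1f} min")
```

Read this way, the two successful builds took roughly 7 and 11 minutes, compared with more than an hour for the run that was killed.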

danielballan commented 2 years ago

The push succeeded: https://hub.docker.com/r/nsls2/scipy-binder/tags