ljlamarche commented 5 years ago

Here's the version of resen-core without conda that @pmreyes2 put together. On my laptop, it creates and image that's 3.62 GB (vs 7.54 GB of our standard image). I've managed to successfully build the image and run most of our tests and tutorials on it.

Minor issues:

The Python 3 environment still shows up (the standard version's been modified so that only py27 and py36 are available)
There's something up with davitpy - it can find the radar.dat file, but it can't read it for some reason. This needs a little more investigating.
Pre-downloading index tables in pyglow when the docker file is built doesn't seem to work. The package is still installed and can be used normally, but the first time it's imported it still goes through the generating tables step.

To test:

Checkout the resen-core-no_conda branch
```
git checkout resen-core-no_conda
```

Build the image

docker build -t earthcubeingeo/resen-core-lite .

Checkout the unittests branch
```
git checkout unittests
```

Start a container and mount in the resen-core directory.

docker run -itp 8000:8000 -v /local/path/to/resen-core/resen-core:/home/jovyan/work/resen-core:rw,Z --name resen-lite_test earthcubeingeo/resen-core-lite bash

Optionally, also mount in the MemorialDayStorm data directory so all tutorials will work.

docker run -itp 8000:8000 -v /local/path/to/resen-core/resen-core:/home/jovyan/work/resen-core:rw,Z -v /local/path/to/MemorialDayStorm:/home/jovyan/mount/datasets/MemorialDayStorm:ro,Z --name resen-lite_test earthcubeingeo/resen-core-lite bash

Start jupyter lab from within container
```
jupyter lab --port 8000
```
Try running the unit tests available in resen-core/testing/tests/
Try running the tutorials available in resen-core/resources/tutorials

asreimer commented 4 years ago

I'm almost done refactoring the dockerfile. One thing to note, this:

To avoid error: OSError: [Errno 99] Cannot assign requested address

RUN /bin/bash -c 'printf "c.NotebookApp.ip = \"0.0.0.0\" \n" >> /home/$NB_USER/.jupyter/jupyter_notebook_config.py'

is not needed. We specifically pass the necessary jupyter server options when we start it: https://github.com/EarthCubeInGeo/resen/blob/master/resen/Resen.py#L562

asreimer commented 4 years ago

Ok, I think I'm done working on this now.

There were some things missing. Remember, we need/want to provide some of the same things that are available in the jupyter/scipy-notebook docker image. So, I've added some of the things that were missing. For example, we need the start-* scripts for the jupyterhub stuff to work on the server.

Also, the dockerfile has been refactored to group common or related commands together. I also added comments to help delineate between groups of commands.

An important thing to note is that we will need to change this line of code in resen because we need to:

source ~/envs/py36/bin/activate

now that we don't have conda installed. We might want to make this change smart enough to know which version of resen-core we are running so it knows how to source the environment (what if we use python 3.7 in a future release?).

When I build the dockerfile, I end up with an image that is 5.15GB in size. For reference, just the base jupyter/scipy-notebook that we were using as a starting point before is 4.66GB.

With this, I think we can move this PR to review and test it before creating a new release.

asreimer commented 4 years ago

Here's how I suggest you test this PR:

# build the new image
docker build -t resen/testing .

# create container
docker run -tdp 9000:9000 --name resen_testing resen/testing bash

# start jupyterlab
docker exec -it resen_testing bash -cl 'source ~/envs/py36/bin/activate && jupyter lab --no-browser --ip 0.0.0.0 --port 9000 --NotebookApp.token=asdfasdfasdf --KernelSpecManager.ensure_native_kernel=False'

# stop jupyterlab
docker exec -it resen_testing bash -cl '/home/jovyan/envs/py36/bin/python -c "from notebook.notebookapp import shutdown_server, list_running_servers; svrs = [x for x in list_running_servers() if x[\"port\"] == 9000]; sts = True if len(svrs) == 0 else shutdown_server(svrs[0]); print(sts) "'

# stop container
docker stop resen_testing

# remove container
docker rm resen_testing

When you have the container running, perform whatever tests you want to do. I just did some basic import checking of python packages and ran some checks that should ensure this works in the server environment.

asreimer commented 4 years ago

This is a useful reference: https://docs.docker.com/develop/develop-images/dockerfile_best-practices/

asreimer commented 4 years ago

I've updated the dockerfile so it doesn't use ENV to set any environment variables that are needed by the jovyan user. Instead, these are set in /home/jovyan/.bashrc

If we find we need some environment variables set for both root and jovyan (not sure why) we can set those using /etc/profile.d or something similiar.

asreimer commented 4 years ago

Merge conflicts are now fixed. When can we merge this? Has anyone else done any testing?

ljlamarche commented 4 years ago

I fixed the pyglow installation (we just needed to set LANG=C.UTF-8 which is now being done at the top of setup_pyglow.sh) but I had to comment out the jupyter notebook patch at the end of the Dockerfile to get it to build. However, it looks like that bugfix has been merged with master (jupyter/notebook#4674), so can we just get rid of the patch?

Also, just a stylistic thing, but because we're not using the docker ENV command to set environment variables anymore, can we move environment variable declarations that are specific to particular packages into the setup scrips for that package? i.e. instead of :

RUN /bin/bash -c 'bash setup_davitpy.sh && rm setup_davitpy.sh && \
                  echo "SD_HDWPATH=/home/$NB_USER/cache/hdw.dat" >> /home/$NB_USER/.bashrc && \
                  echo "SD_RADAR=/home/$NB_USER/cache/radar.dat" >> /home/$NB_USER/.bashrc'

Can we just add the following to the end of setup_davitpy.sh?

echo "SD_HDWPATH=/home/$NB_USER/cache/hdw.dat" >> /home/$NB_USER/.bashrc
echo "SD_RADAR=/home/$NB_USER/cache/radar.dat" >> /home/$NB_USER/.bashrc

asreimer commented 4 years ago

LANG will need to probably be added to .bashrc

The jupyter patch isn't supposed to be there either. I accidentally brought that back in after fixing merge conflicts... Have a look at the commit before the merge conflict to see what I mean.

At the DMV now...

On Wed, Oct 2, 2019, 09:05 Leslie Lamarche notifications@github.com wrote:

I fixed the pyglow installation (we just needed to set LANG=C.UTF-8 which is now being done at the top of setup_pyglow.sh) but I had to comment out the jupyter notebook patch at the end of the Dockerfile to get it to build. However, it looks like that bugfix has been merged with master ( jupyter/notebook#4674 https://github.com/jupyter/notebook/pull/4674), so can we just get rid of the patch?

Also, just a stylistic thing, but because we're not using the docker ENV command to set environment variables anymore, can we move environment variable declarations that are specific to particular packages into the setup scrips for that package? i.e. instead of :

RUN /bin/bash -c 'bash setup_davitpy.sh && rm setup_davitpy.sh && \ echo "SD_HDWPATH=/home/$NB_USER/cache/hdw.dat" >> /home/$NB_USER/.bashrc && \ echo "SD_RADAR=/home/$NB_USER/cache/radar.dat" >> /home/$NB_USER/.bashrc'

Can we just add the following to the end of setup_davitpy.sh?

echo "SD_HDWPATH=/home/$NB_USER/cache/hdw.dat" >> /home/$NB_USER/.bashrc echo "SD_RADAR=/home/$NB_USER/cache/radar.dat" >> /home/$NB_USER/.bashrc

— You are receiving this because your review was requested. Reply to this email directly, view it on GitHub https://github.com/EarthCubeInGeo/resen-core/pull/32?email_source=notifications&email_token=ABDEHE6MY3TEDXSNMTJC63TQMTBFDA5CNFSM4IVNEB52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAFJHQI#issuecomment-537564097, or mute the thread https://github.com/notifications/unsubscribe-auth/ABDEHE6P6EWFNWBV7FYM2Q3QMTBFDANCNFSM4IVNEB5Q .

ljlamarche commented 4 years ago

It seems to be working without it? I set it for the build, then started a container and checked that LANG was NOT set, but I could still import pyglow... I think it needs it to compile all the models but doesn't care later. Regardless, if it won't hurt anything else to have it set, I'll add it to .bashrc just to be on the safe side.

Ok, cool I'll try to remove it again then. Also, should be safe to remove the mangopy and spacepy helper scripts now, correct? I'm trying to tidy up things we're not using.

asreimer commented 4 years ago

Strange! Might be worth reporting to the pyglow repo?

I think it is safe to remove those 2 helper scripts as well. It's a good idea to tidy things up.

On Wed, Oct 2, 2019, 09:25 Leslie Lamarche notifications@github.com wrote:

It seems to be working without it? I set it for the build, then started a container and checked that LANG was NOT set, but I could still import pyglow... I think it needs it to compile all the models but doesn't care later. Regardless, if it won't hurt anything else to have it set, I'll add it to .bashrc just to be on the safe side.

Ok, cool I'll try to remove it again then. Also, should be safe to remove the mangopy and spacepy helper scripts now, correct? I'm trying to tidy up things we're not using.

— You are receiving this because your review was requested. Reply to this email directly, view it on GitHub https://github.com/EarthCubeInGeo/resen-core/pull/32?email_source=notifications&email_token=ABDEHE3XLQAOFYRKUPDG3ETQMTDOLA5CNFSM4IVNEB52YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEAFLG3Y#issuecomment-537572207, or mute the thread https://github.com/notifications/unsubscribe-auth/ABDEHE7FE4AYKAGUFZDSQQTQMTDOLANCNFSM4IVNEB5Q .

ljlamarche commented 4 years ago

Potentially. @pmreyes2 originally found this solution, so he probably has more insight on why exactly it's a problem. I don't recall having to do that when I installed pyglow on my laptop, so it might be something specific to the ubuntu docker container?

pmreyes2 commented 4 years ago

I believe the problem is with python3 numpy f2py that needs LANG to be set to UTF-8 in order to be able to read the FORTRAN text files.

ljlamarche commented 4 years ago

As far as I can tell, pretty much everything seems to be working now. I tested this by starting a container (as suggested by @asreimer above) and grabbing the GeospacePackagesInResen.ipynb tutorial notebook and just running through all the cells.

wget https://raw.githubusercontent.com/EarthCubeInGeo/resen-core/tutorials/resen-core/resources/tutorials/GeospacePackagesInResen.ipynb

The only thing giving me trouble is davitpy. That might have to be it's own PR - it's not an error I'm familiar with.

asreimer commented 4 years ago

Alright. I think we can just merge this in now. The server is currently undergoing a transition, so I'm not able to test anything because I can't log in.

Any modifications needed for the server can easily be handled in another PR.

asreimer commented 4 years ago

Oh, just for reference, the final image size as of this PR being merged is 4.59GB

ljlamarche commented 4 years ago

Good job, this looks great! I'm going to close a bunch of the issues that this PR solves.

EarthCubeInGeo / resen-core

Resen core lite #32

To avoid error: OSError: [Errno 99] Cannot assign requested address