Reproducible-Science-Curriculum / RR-Jupyter-Hackathon-Jan-2017

Curriculum Development Hackathon on Reproducible Research using Jupyter Notebooks, to be held Jan 9-11 at BIDS in Berkeley, CA
Creative Commons Zero v1.0 Universal

How out of the box do we want to be when it comes to installation suggestions? #13

Open elliewix opened 7 years ago

elliewix commented 7 years ago

There have been a lot of good suggestions for ways to install Jupyter, get packages going, etc. These range from pretty standard install methods to very specific and advanced services.

I have several fears about incorporating these advanced services (e.g. the various Docker-based setups and other newer projects) into our lessons. They mostly center on the risk of the lessons going stale and needing regular updates to keep up with those services. The more specific we are about what to install, the more the lessons depend on those tools a) existing, b) staying roughly the same to install and use, and c) remaining the community standard.

My suggestion here is to write the lessons so that they speak to the most general concept of the Jupyter notebook as much as possible (of course there are going to be times when we'll need to dig into specifics). That should let anyone follow along with whichever Jupyter tool they like today, and even with future platforms that don't exist yet.

A more tangible example of this might be the persistent issues with academic computer labs. We have a campus-level policy that bans us from installing anything like OpenRefine or Jupyter notebooks on our lab computers, but we could get away with having them install PyCharm or another piece of standalone software that runs notebooks outside of a web browser. Another campus might want to use these lessons with their JupyterLab/JupyterHub/etc.

choldgraf commented 7 years ago

w/r/t docker etc, I think that it is worth mentioning these kinds of things because (IMO) they are the future of reproducible science and will be part of the backbone of a lot of services built to accomplish this. In terms of putting together material, I think it may be too early to make strong dependencies because the technology seems to be evolving rapidly. Maybe (hopefully) that'll be different in a year or three :)

mpacer commented 7 years ago

Related to this is the question of using conda or a pip-based system, possibly with environments managed by virtualenv or python -m venv. The former is closer to a one-click solution, while the latter is closer to a bare-bones Python way of solving the problem.
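For anyone who hasn't seen the bare-bones route, here is a minimal sketch of what python -m venv does, using the standard library's venv module directly. The helper name create_env and the directory name lesson-env are made up for illustration, and with_pip=False is an assumption to keep the sketch fast and offline (the command-line python -m venv installs pip by default).

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir (hypothetical helper).

    Roughly what `python -m venv env_dir` does, except that pip is
    not bootstrapped here. Returns the expected interpreter path.
    """
    # with_pip=False is an assumption for this sketch; pass True to
    # get pip installed into the new environment as the CLI does.
    venv.create(env_dir, with_pip=False)

    # The interpreter lives in Scripts\ on Windows and bin/ elsewhere.
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "lesson-env"
        python_path = create_env(env_dir)
        # pyvenv.cfg is the marker file that identifies a venv.
        print("created:", (env_dir / "pyvenv.cfg").is_file())
        print("interpreter:", python_path)
```

The conda equivalent is a single conda create command, which is part of why it feels closer to one click; the sketch above is only meant to show how little machinery the plain Python route needs.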

Also, some people don't like that conda is associated with and largely directed by a for-profit company (Continuum Analytics), while others point out that conda itself is open source and developed on GitHub with community engagement. It's worth noting that some Jupyter docs with installation examples are written with conda in mind for virtual environment management, even when the examples explicitly invoke pip install.
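If the lessons try to stay neutral between the two, one option is to have learners check which kind of environment their notebook is actually running in. The following is a rough heuristic, not an official API: the function name describe_environment is hypothetical, and it assumes that conda environments keep a conda-meta directory under their prefix and that venv-style environments have sys.prefix different from sys.base_prefix.

```python
import os
import sys


def describe_environment() -> str:
    """Guess whether the running interpreter is in a conda env,
    a venv-style virtual environment, or neither (heuristic only)."""
    # Assumption: conda environments (including the base install)
    # contain a "conda-meta" directory under sys.prefix.
    if os.path.isdir(os.path.join(sys.prefix, "conda-meta")):
        return "conda"

    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the Python it was created from.
    if sys.prefix != getattr(sys, "base_prefix", sys.prefix):
        return "venv"

    return "system"


if __name__ == "__main__":
    print("environment type:", describe_environment())
    print("interpreter:", sys.executable)
```

Run in a notebook cell, this tells a learner which install instructions apply to them without the lesson text having to assume one tool.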