psychemedia opened this issue 4 years ago
The official Jupyter Docker stack base-notebook (docs) is the smallest official image providing a running notebook server. The minimal-notebook (docs) adds in LaTeX for generating PDFs. PDFs can also be generated via a non-LaTeX route using Chromium (betatim/notebook-as-pdf).
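As a rough sketch (untested; package names as per the betatim/notebook-as-pdf README), the non-LaTeX PDF route might be layered onto the smallest official image along these lines:

```dockerfile
# Sketch only: extend the smallest official stack image and add the
# Chromium-based PDF export route rather than the full LaTeX toolchain.
FROM jupyter/base-notebook

# notebook-as-pdf drives a headless Chromium via pyppeteer
RUN pip install --no-cache-dir notebook-as-pdf && \
    pyppeteer-install
```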
I put out a query regarding the smallest possible Binder container and had minrk/smallest-binder back as a suggestion. That repo includes useful discussion in the README and some example branches.
If building up from an official Python container, python:3.7-slim seems to be the most efficient route in general; python:3.7-alpine may not support all the packages a distribution might need? In terms of pulling Python from a package manager, see this discussion on using miniforge to get Python.
I've been building a container for tt284 and initially I built it off the base images repo2docker uses (about 1.5GB). Then I used a Dockerfile directly to build off python:3.8-slim and that saved about 0.4GB of space (1.1GB). JupyterLab and the Node stuff just take up a lot of space; there seems to be little that can be done about it. That is almost a one-third space saving, but at the cost of having to build everything from scratch.
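For reference, the bare bones of that sort of build look roughly like this (a sketch only, not the actual tt284 Dockerfile; pins and course content are elided):

```dockerfile
# Illustrative python:3.8-slim based notebook image, not the tt284 Dockerfile.
FROM python:3.8-slim

# JupyterLab accounts for most of the image size (Node-built assets etc.)
RUN pip install --no-cache-dir jupyterlab

# Run as a non-root user on the conventional Jupyter port
RUN useradd --create-home jovyan
USER jovyan
WORKDIR /home/jovyan

EXPOSE 8888
CMD ["jupyter", "lab", "--ip=0.0.0.0", "--no-browser"]
```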
While there are no discussion forums, perhaps you could add some labels that would distinguish bugs, discussion and so on.
repo2docker can build from a Dockerfile (e.g. Use a Dockerfile for your Binder repository), so we could certainly create a minimal viable Docker container that runs either standalone, via MyBinder/BinderHub, or via JupyterHub etc.
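To give a feel for what that involves, a minimal Binder-launchable Dockerfile looks something like the following (modelled on the binder-examples minimal-dockerfile pattern; treat it as a sketch rather than a tested recipe):

```dockerfile
# Minimal sketch of a Dockerfile that MyBinder/BinderHub can launch,
# following the binder-examples minimal-dockerfile pattern.
FROM python:3.8-slim

RUN pip install --no-cache-dir notebook

# BinderHub runs the container as a non-root user with a fixed UID
ARG NB_USER=jovyan
ARG NB_UID=1000
ENV USER=${NB_USER} \
    HOME=/home/${NB_USER}
RUN adduser --disabled-password --gecos "Default user" --uid ${NB_UID} ${NB_USER}

# Copy the repository contents into the notebook user's home directory
COPY . ${HOME}
RUN chown -R ${NB_UID} ${HOME}
USER ${NB_USER}
WORKDIR ${HOME}
```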
Current examples include innovationOUtside/OUbrandednotebook, which uses a Dockerfile that builds on a Jupyter base container but could be built on something more minimal, and ouseful-course-containers/ou-example, which simply adds OU customisation to a default repo2docker/MyBinder build.
One thing I am keen not to lose is the ability for other people to create containers as easily and straightforwardly as possible.
Running something on MyBinder from a vanilla repo means the user (another academic interested in exploring things, for example) doesn't really have to do anything other than put some content files into the repo. Adding a requirements.txt is not overly onerous, although in several years of MyBinder availability I don't think I've managed to get anyone other than me to actually try to use it in the OU, because folk see even getting a GitHub account and clicking on "Create New Repository" as too hard... And for many, understanding requirements.txt is apparently also too hard.
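For what it's worth, the requirements.txt for a simple Binder repo really can be as small as a couple of lines (package names here are purely illustrative):

```
# requirements.txt: whatever the notebooks import, one package per line
pandas
matplotlib
```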
Another advantage of using official Jupyter layers is that they are maintained, and they work and are regularly tested against other bits of the Jupyter ecosystem. There does not appear to be a culture of even occasional, let alone regular or continuous, integration and testing in the systems LDS "support" or "test" for module use, so any way of leveraging standard-issue Jupyter containers is to our advantage.
Not locking things into particular OU base layers also means other educators might be willing to build and contribute environments.
I agree that a build without JupyterLab might be useful, but perhaps we could use postBuild instructions to strip out the applications we don't want from the Jupyter base containers? The build time is not necessarily a consideration if we are shipping prebuilt images to students.
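Something along these lines might work as a starting point (untested, and it assumes JupyterLab was pip-installed in the base image; a conda/mamba based stack image would need the equivalent conda remove instead):

```bash
#!/bin/bash
# postBuild sketch: strip JupyterLab back out of a Jupyter base container.
# Untested; assumes JupyterLab was installed with pip in the base image.
set -eux

pip uninstall -y jupyterlab || true

# Claw back any cache space left behind by the build
pip cache purge || true
```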
I'm not so sure whether the size difference is that important, so I wouldn't worry too much about it.
I agree that the bigger issue is supporting other academics in using this technology. That's a much trickier aspect. I'm not particularly convinced that we can build something interesting that allows other academics to simply drop their content into a copy of a template. I think we can build something, but it is then effectively the digital equivalent of a chalk'n'talk lecture. If you want to do something interesting with the technology, then you have to get your hands dirty with the technology and adapt it to the needs of the content. For that, I don't think we can provide much beyond some help with the technology. Perhaps when we have a broader set of containers people are using, then we can see what can be done/provided to help others join in?
@mmh352 "I'm not so sure whether the size difference is that important, so I wouldn't worry too much about it." <- you seem to be worrying about it by wanting to build your own images from a de novo Dockerfile! ;-)
Re: making things useful to others, I think I disagree. We can have a range of base containers that folk could use as a provided off-the-shelf environment, and we could perhaps even automate the construction of Binder templates from a form-driven UI (e.g. select which applications you want bundled into the container).
For examples around the edges: nbgitpuller (the idea here is that you build an app/requirements image in one repo, and then use that as a base box environment into which you pull content from your simple content repo; here, academics can just write the content in their repo, and then trivially pull it into your HTML app writing environment built from another repo). See Binder Base Boxes, Several Ways… for a riff on this. :-)

Well, the size conclusion came after I re-built it using a de novo Dockerfile :-D. Some things you just have to learn the hard way.
Thank you for the links. I'm basically using an approach very similar to the Binder base boxes + nbgitpuller approach.
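For anyone following the thread, a launch link for that pattern looks roughly like this (the org/repo names are placeholders, and it is shown unencoded for readability; the nbgitpuller link generator produces the properly URL-encoded form):

```
https://mybinder.org/v2/gh/SOME-ORG/environment-repo/HEAD?urlpath=git-pull?repo=https://github.com/SOME-ORG/content-repo
```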
Adding a link to the demo TT284 example Dockerfile config: https://github.com/mmh352/tt284-container
This is a discussion issue relating to creating efficiently sized Docker containers / images for use in delivering Open Computing Lab environments.
When GitHub makes discussion forums available it would perhaps make sense to start a thread there.