Structure and intent of this repo

psychemedia commented 4 years ago

This repo was originally created as a scratchpad for developing the Open Computing Lab idea by imagining what a guide for such a thing might look like and then iterating the idea and the guide in tandem with each other.

Already, I've started trying to produce generic resources that could be pulled into different modules. For example, https://github.com/OpenComputingLab/userguide is a guide for users getting started howsoever (MyBinder, Docker etc) that I'm already pulling into two repos defining OCL environments intended for use in 20J: TM129 and TM351.

What I'm imagining now for this innovationOUtside/Open_Computing_Lab_Guide repo is a top level thinkspace for developing the idea still. As things become more concrete as identified components, they'll move into repos in the OpenComputingLab Github org.

So at some point hopefully quite soon (I just need to check some diffs!), some of the docs in this repo will disappear and be replaced by the OpenComputingLab/userguide submodule, for example.

psychemedia commented 4 years ago

Re: the userguide, my original thinking was to try to produce some generic OCL materials that all modules using the OCL approach could draw on. But it's maybe also worth considering the possibility of the userguide being a template repo, and having templated items (eg {MODULECODE}), and actually rendering customised paths etc into the userguide docs for each module.

A downside of the template approach is that things could start to drift across different modules if tweaks / updates are made?

At the moment, I am using an exemplar, but real, module code in the "common" userguide. I wonder if I should make up a module code that doesn't exist for the common materials, and maybe also generate fake examples of stuff using that course code? This could then also work as a training example for folk wanting to try OCL out with a working example, albeit not one for a real course? So XYZ987 maybe?
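A minimal sketch of the templating idea, assuming the placeholders use Python `str.format`-style braces as in `{MODULECODE}`. The template wording, the `PRESENTATION` and `MODULECODE_LOWER` placeholders, and the `example/...` image name are all hypothetical and not taken from the real userguide; `XYZ987` is the made-up module code floated above.

```python
# Hypothetical template text; the real userguide wording and paths may differ.
TEMPLATE = (
    "# Getting started with {MODULECODE}\n"
    "\n"
    "Your notebooks live in the {MODULECODE}-{PRESENTATION} folder.\n"
    "Pull the image with: docker pull example/{MODULECODE_LOWER}-ocl\n"
)


def render_userguide(template: str, module_code: str, presentation: str) -> str:
    """Fill the placeholders in a userguide template for one module."""
    return template.format(
        MODULECODE=module_code,
        MODULECODE_LOWER=module_code.lower(),
        PRESENTATION=presentation,
    )


if __name__ == "__main__":
    # Render the guide for the made-up module code.
    print(render_userguide(TEMPLATE, "XYZ987", "20J"))
```

One caveat with this approach: any literal braces in the guide text would need escaping as `{{` and `}}`, so a dedicated templating tool may suit larger documents better.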

mmh352 commented 4 years ago

My view would be that perhaps the two aspects should be split into two repositories.

One that has the "shared" userguide that focuses on the really generic elements. This would include some very concrete stuff, like installing Docker, and then more "theoretical" stuff, explaining how the various technologies used in the OCL work (Docker, Jupyter, MyBinder, Repo2Docker, ...).

The second repository would then contain templates for the various modules' specific instructions. Yes, the template approach could lead to some drift, but it also means that the specific instructions will (almost) always work for that specific module. Otherwise, with shared documents, for any change in those documents, all modules would have to be tested. Also, the modules will need specific instructions in any case, and those might then conflict with changes in the shared documents. Where appropriate the templates would refer to the generic userguide, so the templates themselves would be relatively short and focused almost exclusively on concrete instructions.

psychemedia commented 4 years ago

Right. So the shared userguide (generic stuff) is at https://github.com/OpenComputingLab/userguide .

I could start to ponder what a "cookie cutter" template repo for a new module would be like once I've settled down what I need for TM351, TM129, and TM112.

mmh352 commented 4 years ago

I wonder whether we will need at least two template repos? One for people using Jupyter Notebooks, one for people using just Docker, and perhaps some more then?

psychemedia commented 4 years ago

@mmh352 At the moment, I'm already wary of creating yet more repos to lose stuff in!

If folk want to use arbitrary apps but not the notebook server, that's fine. But I'm also advocating adding a notebook server to every container for a variety of reasons, for example:

1) the container becomes trivially deployable via JupyterHub/BinderHub, which simplifies multi-user and on-demand delivery;
2) apps can be accessed through jupyter-server-proxy on a single port, down a nicely named path; the server provides a token/password challenge authentication if required;
3) even if educators don't want to use notebooks, students might find them convenient as a place to make notes. (I want to find ways of encouraging folk to make notes, particularly if we can find ways of helping them link notes to particular app settings / views / configs etc.)

Wrapping an app as installable in the Jupyter context using a simple jupyter-server-proxy is not too much overhead (there's even a cookiecutter, though I haven't tried it yet) and it makes the app relatively portable across Jupyter server environments of whatever flavour.
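As a rough illustration of what such a wrapper involves, here is a sketch of the kind of setup function jupyter-server-proxy can discover through the `jupyter_serverproxy_servers` entry-point group. The app here is a stand-in (Python's built-in `http.server`), and the names `setup_myapp`, `mypkg` and `My App` are hypothetical; the current jupyter-server-proxy documentation is the place to check the exact keys supported.

```python
def setup_myapp():
    """Describe a hypothetical web app for jupyter-server-proxy to launch."""
    return {
        # Command used to start the app; jupyter-server-proxy substitutes
        # {port} with a free port that it picks at launch time.
        "command": ["python3", "-m", "http.server", "{port}"],
        # Seconds to wait for the app to start answering requests.
        "timeout": 20,
        # Adds an entry for the app to the notebook / JupyterLab launcher.
        "launcher_entry": {"title": "My App"},
    }


# In the wrapper package's setup.py, the function would be registered with
# something like the following (package and entry names are hypothetical):
#
#   entry_points={
#       "jupyter_serverproxy_servers": ["myapp = mypkg:setup_myapp"],
#   }
#
# The entry name ("myapp") then becomes the path the app is proxied under.
```

With a package like this installed alongside the notebook server, the app should be reachable at a path such as `/myapp/` behind the server's own authentication.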

mmh352 commented 4 years ago

That's a fair point :-). The only thing I'm slightly concerned about with that is the size of the containers for downloading, but then everything has a down-side.

Are you advocating using JupyterLab across all containers as well then?

psychemedia commented 4 years ago

@mmh352 I'm not sure how far you can tune what Jupyter package installs actually install, or how decomposed the component packages are. Eg I wonder if pip install notebook is good enough in a clean Python environment to get a basic notebook server running?

Also, repo2docker does add a lot of stuff, but the intention is to make something generally useful that is likely to "just work" for a lot of simple use cases. By default, I think the base containers all contain JupyterLab (I'm not sure how to install just notebook and not JupyterLab; maybe just pip install notebook).

I'm not sure how much weight JupyterLab adds to the build.
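One rough way to get a feel for this from inside an environment is to add up the files each installed distribution reports. The sketch below uses only the standard library; `installed_size_bytes` is a hypothetical helper name, and the figure it gives covers only the named package's own files, not its dependencies or the base image, so it is a lower bound rather than a real image-size measurement.

```python
from importlib import metadata
from pathlib import Path
from typing import Optional


def installed_size_bytes(dist_name: str) -> Optional[int]:
    """Total size of a distribution's installed files.

    Returns None if the distribution is not installed or records no files.
    """
    try:
        files = metadata.distribution(dist_name).files
    except metadata.PackageNotFoundError:
        return None
    if files is None:
        return None
    total = 0
    for entry in files:
        path = Path(entry.locate())
        # Skip files that are recorded but no longer present on disk.
        if path.is_file():
            total += path.stat().st_size
    return total


if __name__ == "__main__":
    for name in ("notebook", "jupyterlab"):
        size = installed_size_bytes(name)
        label = "not installed" if size is None else f"{size / 1e6:.1f} MB"
        print(f"{name}: {label}")
```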

I'll start another issue on managing image sizes: https://github.com/innovationOUtside/Open_Computing_Lab_Guide/issues/3

mmh352 commented 4 years ago

It is a hefty weight, but I've been playing with it a bit now, and it will simplify a lot of the systems development, so definitely worth it. A definite convert to JupyterLab + jupyter-server-proxy here. :-)

psychemedia commented 4 years ago

It's worth bearing in mind what does what:

Jupyter notebook single user server does notebooks;
JupyterLab is the IDE and is an alternative, but not required, UI; I rarely use it because it is too complex, though there are simplifications possible, eg yuvipanda/simplest-notebook; I'm not sure if the simplest-notebook reduces weight at all; there also needs to be a simpler way of creating custom JupyterLab IDE setups/configurations!
jupyter-server-proxy extends the basic single user notebook server to allow the proxying of additional services.

There is another, community-developed proxy, ideonate/jhsingle-native-proxy, which removes the need for the Jupyter server proxy and wraps your application with a shim that makes it look like there is a notebook server there, so that a JupyterHub server can manage the container "as if" it were running a notebook server. I didn't have much success with an early attempt at proxifying OpenRefine with it, but it may have moved on, whether in code, example or documentation support, since I last tried. It represents another option if you really do just want to ship a standalone application container running just that application and no notebook server.

mmh352 commented 4 years ago

I'll respond to some of the comments in the size discussion here.

The UI I'm building for the students contains the following three elements:

A viewer for the tutorial
A web-based code editor (primarily for HTML/CSS/JS/PHP)
A viewer for the web-site generated by the code edited in the editor

I was planning to do more heavy-lifting on this myself (code-wise), but I've now decided to let JupyterLab handle most of the elements and my code just provides a bit of glue. Currently it basically works using the following technologies:

Sphinx to build the actual tutorial
lighttpd to serve the tutorial content and the website via jupyter-server-proxy
Theia as the editor, again via jupyter-server-proxy

I've got the first two working, the third is next on the list.

I'm using a custom Dockerfile, but building through repo2docker, so in theory the repository should then also run on MyBinder/JupyterHub ... . I'm using my own Dockerfile for the following reasons:

It allows me to build a smaller image.
It allows me to integrate various additions where they are appropriate, rather than creating one massive postBuild script.
It allows for faster builds, as my Dockerfile uses staged builds, so doesn't have to build everything from scratch every time.
Everything is explicit, no "magic" happening in the background.

psychemedia commented 4 years ago

@mmh352 I'd be keen to see your build, and then perhaps do a "simple Binderised version" of it to see how the container sizes compare (this could be really interesting...).

FWIW, Theia features in the jupyter-server-proxy docs.

psychemedia commented 4 years ago

@mmh352 How are you authoring the tutorial? I tend to do most of my writing in markdown now, either using a markdown editor or using a notebook editor via jupytext, then rendering to HTML from markdown (perhaps via jupytext, if I need executed code cell outputs in the output HTML) using nbsphinx or Jupyter Book. (I tend to try to automate the rendering step using a Github Action.)

psychemedia commented 4 years ago

@mmh352 A couple of things:

if you are using JupyterLab, then JupyterLab workspaces might be worth exploring; here's an issue relating to JupyterLab workspaces — https://github.com/innovationOUtside/Open_Computing_Lab_Guide/issues/5 — for any such related items;
if you turn up anything handy getting things like Theia to work, here's a related discussion issue on jupyter-server-proxyfying arbitrary applications: https://github.com/innovationOUtside/Open_Computing_Lab_Guide/issues/4

mmh352 commented 4 years ago

I'm authoring the tutorial using Sphinx with ReST as my primary text format. I've got a few extensions that allow me to create content that mimics what is currently available in the VLE (videos, activities with hidden answers, ...).

I'm using the JupyterLab workspaces to pre-load the default workspace layout that I want. The documentation on the actual workspace json format is not the greatest, but to get the basics working was not that hard.

Because I'm building through repo2docker, in theory it should run on Binder as well. The problem there is that I pull in the tutorial content itself via git+ssh and unfortunately outgoing ssh connections are disabled on MyBinder.

That is one thing I haven't found a fully satisfactory solution to. I don't really want to make the module tutorial content publicly accessible (at least not easily), but I still want to be able to load it and update it from within the container. I also don't want the students to have to load/update the tutorial content; I want it to happen automatically. Any suggestions?

psychemedia commented 4 years ago

@mmh352 Re: pulling from a private repo, I have started another thread for this: https://github.com/innovationOUtside/Open_Computing_Lab_Guide/issues/7
