ministryofjustice / analytical-platform

Analytical Platform • This repository is defined and managed in Terraform
https://docs.analytical-platform.service.justice.gov.uk
MIT License

✨ Build JupyterLab image #4607

Open michaeljcollinsuk opened 4 months ago

michaeljcollinsuk commented 4 months ago

User Story

As an Analytical Platform engineer, I want to release a single JupyterLab image ~that covers functionality from Jupyter's all-spark and datascience~ so that users have one Jupyter to use and we have one Jupyter to support.

Value / Purpose

~We currently offer two variants of Jupyter Notebook (all-spark and datascience) with varying packages layered on top, we are also running v3.x release of these which EOL'd in May 24 (https://blog.jupyter.org/jupyterlab-3-end-of-maintenance-879778927db2)~

We want to offer a minimal JupyterLab image, based on the Analytical Platform Cloud Development Environment Base Image, for users of the Analytical Platform.

Useful Contacts

@jacobwoffenden @Gary-H9

User Types

Analytical Platform engineering

Proposal

Build a JupyterLab offering from scratch like we do for Visual Studio Code, allowing us to understand and control the entire offering.

Subject to user research this image may need to combine the functionality of Jupyter's all-spark and datascience

Definition of Done

julialawrence commented 4 months ago

https://github.com/ministryofjustice/analytical-platform/issues/4165

julialawrence commented 4 months ago

Jacob has a working prototype for this.

tom-webber commented 3 months ago

Jupyter notebook image relationship / hierarchy diagram and image content summaries

Gary-H9 commented 2 months ago

The approach outlined in the ticket above is no longer the approach that will be taken in this work.

The new approach (outlined in this ticket) is to build a basic JupyterLab image. Pending user research, we will either add the required packages to it or rely on users to set up their applications to install packages.

The new image relies on the AP Cloud Development Base Image which includes Python 3.12 and Conda.

I am going to investigate the datascience and all-spark notebooks to better understand which packages users may require, and whether we can install them or instruct users on how to install them.

Gary-H9 commented 2 months ago

Updated the description of this issue to match the work underway.

Created a Miro board describing the layering of the images. This can be expanded upon / used to explain our desire for this move to the users (if needed). This can also be used in the RStudio work.

On the presumption that users will ask for this as part of the user research, I successfully managed to install the R kernel onto the existing jupyterlab build. I'll convert this to be done programmatically and create a branch.

🚧 Testing done in this branch. Installing via conda seems to be the cleanest way to make the R kernel appear in the JupyterLab launcher. Bizarrely, this downgrades R from 4.4.1 to 4.4.0. As before, whether this is needed depends on UR; if desired, we can simply provide users with instructions based on this.
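A rough sketch of how the "R kernel appears in the launcher" check could be automated, for illustration only. It assumes the kernel is registered through Jupyter's kernelspec mechanism and reads the JSON produced by `jupyter kernelspec list --json`. The helper names (`kernel_languages`, `has_r_kernel`, `installed_kernelspecs`) and the sample data are hypothetical, as is the kernel name `ir`, which is what IRkernel normally registers; the branch itself may do this differently.

```python
import json
import subprocess


def kernel_languages(kernelspec_json: str) -> dict[str, str]:
    """Map each kernel name to the language declared in its kernel spec."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("language", "")
        for name, info in specs.items()
    }


def has_r_kernel(kernelspec_json: str) -> bool:
    """Return True if any registered kernel declares R as its language."""
    return any(
        lang.lower() == "r" for lang in kernel_languages(kernelspec_json).values()
    )


def installed_kernelspecs() -> str:
    """Return the kernelspec listing as JSON text (needs `jupyter` on PATH)."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical sample shaped like the real listing; the paths are made up.
SAMPLE = json.dumps(
    {
        "kernelspecs": {
            "python3": {
                "resource_dir": "/example/share/jupyter/kernels/python3",
                "spec": {"display_name": "Python 3", "language": "python"},
            },
            "ir": {
                "resource_dir": "/example/share/jupyter/kernels/ir",
                "spec": {"display_name": "R", "language": "R"},
            },
        }
    }
)

if __name__ == "__main__":
    print(has_r_kernel(SAMPLE))
```

Inside a built image, `has_r_kernel(installed_kernelspecs())` would run the same check against the real listing, which could serve as a simple container test if the R kernel ends up being shipped.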

Gary-H9 commented 2 months ago

Blocked by User Research.

Gary-H9 commented 2 weeks ago

Progressing this now that we have a few issues to complete following user research.

https://github.com/orgs/ministryofjustice/projects/27/views/18?pane=issue&itemId=85562596&issue=ministryofjustice%7Canalytical-platform%7C5901

jacobwoffenden commented 2 weeks ago

Moving to blocked while we figure out a rollout plan

Gary-H9 commented 2 weeks ago

Created this PR to be merged on agreement of our release schedule.

Gary-H9 commented 1 week ago

nbstripout merged.