informatics-lab / ukrse_2022_mlops_walkthrough

Material for UK RSE 2022 Conference Walkthrough on MLOps for RSEs

MLOps for RSEs

This repository contains material first presented at the 2022 conference of the Society of Research Software Engineering as a walkthrough on MLOps for RSEs. It covers the key aspects of MLOps that RSEs will be expected to support and engage with. The material is primarily a series of Jupyter notebooks.

Running the material

To run the notebooks, work through the following steps, each described in its own subsection below: clone the repository, set up the conda environments, get the data, and run Jupyter Lab.

Clone the repository

From the command line, run the following command:

git clone https://github.com/informatics-lab/ukrse_2022_mlops_walkthrough.git

Then navigate to the root directory of your local copy of the repository.

Set up a conda environment

First install anaconda or miniconda if you do not already have one of them installed. Follow the instructions on the conda website to do this.

Once you have installed conda (via anaconda or miniconda), set up the relevant conda environments on your machine. This walkthrough uses several conda environments, following the good practice of creating separate environments for the different stages and tasks in a typical machine learning project or pipeline.

Given an environment file such as requirements.yml, you can create the environment it describes with the following command:

conda env create --file requirements.yml
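The environment creation step can fail for mundane reasons, such as conda not being on the PATH or the command being run from the wrong directory. The following is a minimal sketch that checks for both before calling conda. The `ENV_FILE` and `STATUS` variable names are illustrative assumptions, and the file name is simply the one used in the command above.

```shell
# File name taken from the command above; adjust for other environment files.
ENV_FILE="requirements.yml"

if ! command -v conda >/dev/null 2>&1; then
    # conda is not installed or not on the PATH
    STATUS="conda-missing"
    echo "conda not found on PATH; install anaconda or miniconda first"
elif [ ! -f "$ENV_FILE" ]; then
    # the environment file is not in the current directory
    STATUS="file-missing"
    echo "$ENV_FILE not found in $(pwd)"
elif conda env create --file "$ENV_FILE"; then
    STATUS="created"
    # list all environments to confirm the new one is present
    conda env list
else
    STATUS="create-failed"
    echo "conda env create failed for $ENV_FILE"
fi

echo "status: $STATUS"
```

Repeat the same pattern for each environment file the walkthrough uses.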

The following environments should be installed for this walkthrough:

Get the data

The data used in this walkthrough is available as an archive on Zenodo. Unfortunately, you can't easily download all the files in a Zenodo record at once, so you will need to download each file individually. This can be done with the wget commands listed below. It is recommended that the files be placed in ~/data/ukrse2022.
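As a rough sketch of the download step, the loop below creates the recommended directory and fetches each file into it. The Zenodo file URLs are not reproduced here, so `FILE_URLS` is a hypothetical placeholder that is deliberately left empty; fill it in with the URLs from the Zenodo record. The `DATA_DIR` variable name is also an assumption made for illustration.

```shell
# Target directory recommended in the text above.
DATA_DIR="${DATA_DIR:-$HOME/data/ukrse2022}"
mkdir -p "$DATA_DIR"

# Hypothetical placeholder: a space-separated list of file URLs taken from
# the Zenodo record. Left empty on purpose; no real URLs are assumed here.
FILE_URLS=""

for url in $FILE_URLS; do
    # --continue resumes a partially downloaded file;
    # --directory-prefix saves the file into DATA_DIR.
    wget --continue --directory-prefix="$DATA_DIR" "$url"
done

# Show what is now in the data directory.
ls -lh "$DATA_DIR"
```

With `FILE_URLS` empty, the loop does nothing and only the directory is created, so the script is safe to run before the URLs are filled in.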

Run Jupyter Lab

Navigate to the repository root if you are not already there. Activate one of the conda environments, for example:

conda activate ukrse2022_mlops_data_prep

Then run Jupyter Lab with the following command:

jupyter lab

The Jupyter Lab interface will pop up in your default browser.
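If you are working on a remote machine, or a browser does not open automatically, the two steps above can be wrapped in a small helper. This is only a sketch, assuming a bash shell: the function name `start_lab` and the default port 8888 are illustrative choices, and the default environment name is the example used above.

```shell
# Hypothetical helper: activate a conda environment, then start Jupyter Lab
# without opening a browser. Assumes bash and that conda is on the PATH.
start_lab() {
    env_name="${1:-ukrse2022_mlops_data_prep}"  # example environment from above
    port="${2:-8888}"                           # assumed default port

    # Make "conda activate" available in non-interactive shells.
    eval "$(conda shell.bash hook)"
    conda activate "$env_name"

    # --no-browser stops Jupyter from opening a browser window;
    # it prints a URL that you can open manually instead.
    jupyter lab --no-browser --port="$port"
}
```

Calling `start_lab` with no arguments uses the defaults; `start_lab ukrse2022_mlops_data_prep 8890` selects a different port.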

Links