This repository contains material first presented at the 2022 conference of the Society of Research Software Engineering as a walkthrough on MLOps for RSEs. It gives a broad overview of the key aspects of MLOps that RSEs will be expected to support and engage with. The material is primarily a series of Jupyter notebooks.
To run the notebooks, you will need to go through the following steps (see subsections below for more details on each step):
From the command line, run the following command:
git clone https://github.com/informatics-lab/ukrse_2022_mlops_walkthrough.git
Then navigate to the root directory of your local copy of the repository.
First install Anaconda or Miniconda if you do not already have one of them installed. Follow the instructions on the conda website to do this.
Once you have installed conda (via Anaconda or Miniconda), you need to set up the relevant conda environments on your machine. This walkthrough uses several conda environments, following the good practice of creating separate environments for the different stages and tasks in a typical machine learning project or pipeline.
Each environment is described by a requirements file. You can create the environment that a file describes with a command of the following form:
conda env create --file requirements.yml
The following environments should be installed for this walkthrough:
requirements_data_prep.yml
requirements_model_development.yml
requirements_model_evaluation.yml
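The three files above can be processed in one pass rather than one command at a time. The following is a minimal sketch, assuming the files sit in the repository root and that conda is on your PATH. The function name create_walkthrough_envs is hypothetical and is not part of the repository.

```shell
# Hypothetical helper (not part of the repository): create one conda
# environment per requirements file listed in this walkthrough.
# Assumption: it is run from the directory that holds the three files.
create_walkthrough_envs() {
    for env_file in requirements_data_prep.yml \
                    requirements_model_development.yml \
                    requirements_model_evaluation.yml; do
        echo "Creating environment from ${env_file}"
        # Stop at the first failure so the error is easy to spot.
        conda env create --file "${env_file}" || return 1
    done
}
```

After defining the function, call create_walkthrough_envs from the repository root, then run conda env list to confirm that the new environments are present.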
The data used in this walkthrough is available as an archive on Zenodo. Zenodo offers no easy way to download all the files in a record at once, so you will need to download each file individually. This can be done with the wget commands listed below. We recommend placing the files in the directory ~/data/ukrse2022.
wget https://zenodo.org/record/6966937/files/2021_met_office_aviation_rotors.csv -P ~/data/ukrse2022/
wget https://zenodo.org/record/7022648/files/2021_met_office_aviation_rotors_preprocessed.csv -P ~/data/ukrse2022/
wget https://raw.githubusercontent.com/informatics-lab/ukrse_2022_mlops_walkthrough/main/rotors_catalog.yml -P ~/data/ukrse2022/
wget https://zenodo.org/record/7022648/files/era5_mslp_UK_2017_2020.nc -P ~/data/ukrse2022/
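Because each file is fetched separately, it is easy to miss one or to end up with an empty file after an interrupted download. The sketch below checks that the four files named in the wget commands above are present and non-empty. The function name check_walkthrough_data is hypothetical and is not part of the repository; the default directory is the recommended ~/data/ukrse2022.

```shell
# Hypothetical helper (not part of the repository): report which of the
# four files named in the wget commands are missing from a directory.
# Usage: check_walkthrough_data [directory]   (default: ~/data/ukrse2022)
check_walkthrough_data() {
    data_dir="${1:-$HOME/data/ukrse2022}"
    missing=0
    for name in 2021_met_office_aviation_rotors.csv \
                2021_met_office_aviation_rotors_preprocessed.csv \
                rotors_catalog.yml \
                era5_mslp_UK_2017_2020.nc; do
        # -s is true only if the file exists and is larger than zero bytes.
        if [ ! -s "${data_dir}/${name}" ]; then
            echo "missing or empty: ${data_dir}/${name}"
            missing=1
        fi
    done
    # Exit status 0 means all four files were found.
    return "${missing}"
}
```

Calling check_walkthrough_data with no argument checks the recommended directory. It prints nothing and returns 0 when every file is in place, so it can be combined with other commands, for example check_walkthrough_data && jupyter lab.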
Navigate to the repository root if you are not already there. Activate one of the conda environments, for example:
conda activate ukrse2022_mlops_data_prep
Then start JupyterLab with the following command:
jupyter lab
The JupyterLab interface should open in your default browser.