This PR creates training data using data sourced from the lake-temperature-model-prep pipeline. It does the following:

- Pairs NLDAS drivers with lakes using the meteo crosswalk from `7_config_merge/out/nml_Kw_values.rds`
- Adds static clarity values to the lake metadata (not used for training yet)
- Changes the model-prep rules that form the lake metadata and meteo crosswalk csvs into checkpoints. They need to be checkpoints because their outputs are used inside the functions `dynamic_filenames_model_prep` and `get_lake_sequence_files`. If they weren't checkpoints, those files wouldn't be created before the functions are called, and the pipeline wouldn't work.
- Alters `lake_sequences.py` to accommodate both MNTOHA and model-prep data
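For reviewers less familiar with checkpoints: the pattern looks roughly like the sketch below. The rule names, file paths, and the `read_lake_ids` helper are illustrative only, not the actual names in `2_process.smk`; only `get_lake_sequence_files` and `create_training_data` come from this PR.

```python
# Sketch of the Snakemake checkpoint pattern (names/paths are illustrative).
checkpoint lake_metadata:
    output:
        "2_process/out/model_prep/lake_metadata.csv"
    script:
        "2_process/src/lake_metadata.py"

def get_lake_sequence_files(wildcards):
    # checkpoints.<name>.get() pauses DAG evaluation until the checkpoint
    # has actually run, so its output file exists when we read it here.
    # A plain rule's output would not be guaranteed to exist at this point.
    metadata_csv = checkpoints.lake_metadata.get(**wildcards).output[0]
    lake_ids = read_lake_ids(metadata_csv)  # hypothetical helper
    return [f"2_process/out/model_prep/sequences_{i}.npy" for i in lake_ids]

rule create_training_data:
    input:
        get_lake_sequence_files
    output:
        "2_process/out/model_prep/train.npz"
```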
Here's a DAG for forming the model-prep training data:
## How to run the code

I've built out the pipeline up through the rule `create_training_data`. So, to run this part of the pipeline:

```
snakemake -c1 2_process/out/model_prep/train.npz
```
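Once that target builds, a quick way to sanity-check the output is to list the arrays in the `.npz` archive with numpy. The snippet below writes a toy archive to stand in for `2_process/out/model_prep/train.npz`; the array name `sequences` and its shape are hypothetical, not the pipeline's actual keys.

```python
import numpy as np

# Toy stand-in for the pipeline's train.npz (name/shape are hypothetical).
np.savez("toy_train.npz", sequences=np.zeros((4, 400, 9), dtype=np.float32))

# List the stored arrays and their shapes/dtypes.
with np.load("toy_train.npz") as data:
    for name in data.files:
        print(name, data[name].shape, data[name].dtype)
```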
## How to review this PR

### Level of review requested

The main things I'd like reviewed are:

- Do the pipeline additions and modifications make sense?
- Have any errors been introduced into the existing MNTOHA pipeline?

### Where in the code to focus

Anything that's been changed is fair game, but the majority of the changes are in `2_process.smk`.
## Issues that will be addressed in upcoming PRs (so don't worry about them yet)

- Form validation set during 2_process
- Save more metadata alongside `train.npz`, `validate.npz`, and `test.npz` for use during model evaluation