DOI-USGS / lake-temperature-model-prep


Scaling GCM munge steps to full footprint #346

Closed · lindsayplatt closed this 2 years ago

lindsayplatt commented 2 years ago

Fixes #279 (but not #280). Ended up with tiles of 20x20 cells. With these updates, the full build took just over 15 hours (the download step alone took 13 hours, which was less than the download time for just MN when run locally).
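
For reference, here is a minimal sketch of how grid cells can be grouped into 20x20 tiles so that each download task covers at most 400 cells. The grid dimensions, the cells table, and the tile ID format are illustrative assumptions, not the actual objects used in this pipeline.

library(dplyr)

tile_dim <- 20  # cells per tile edge, so each tile covers up to 400 cells

# Hypothetical table of GCM grid cells in the full footprint, indexed by
# integer column (x) and row (y) positions; the real cell data will differ.
cells <- tidyr::expand_grid(x = 1:137, y = 1:85)

cells_tiled <- cells %>%
  mutate(
    tile_x = as.integer(ceiling(x / tile_dim)),
    tile_y = as.integer(ceiling(y / tile_dim)),
    tile_id = sprintf('tile_%02d_%02d', tile_x, tile_y)
  )

# One download task per tile, e.g. as a grouping variable for branching
count(cells_tiled, tile_id)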

Looking for @hcorson-dosch to review:

Looking for @jesse-ross to review all the others (which he mostly has already):

-- Late additions below this line --

Forgot to include this quick little code snippet for how I determined the time required to run the sub-pipeline.

library(targets)
library(dplyr)

# Sum the per-target runtimes recorded in the targets metadata and
# convert from seconds to minutes.
calc_minutes <- function(metadata) {
  round(sum(metadata$seconds, na.rm = TRUE)/60, 2)
}

tar_meta(starts_with('gcm_data_raw_feather')) %>% calc_minutes() # Took 13.9 hours to download
tar_meta(gcm_nc) %>% calc_minutes() # Took 17 minutes to convert to NetCDFs.
tar_meta() %>% calc_minutes() # The full pipeline took 15.2 hours
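
A note on interpretation: tar_meta()$seconds records each target's individual runtime, so these sums reflect total compute time and approximate wall-clock time only when targets are built sequentially.
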
jesse-ross commented 2 years ago

@lindsayplatt I set up the .sif file on tallgrass, and created the symlink with ln -s lake-temperature-model-prep_v0.1.sif lake-temperature-model-prep.sif, so once you pull the changes down on tallgrass, you should be ready to submit to SLURM! We may need to iterate on the container, so please let me know if it runs into problems.

lindsayplatt commented 2 years ago

@jesse-ross thank you! Can't wait to try it out