JGCRI / fldgen

Given a global mean temperature pathway, generate random global climate fields consistent with it and with spatial and temporal correlation derived from an ESM
https://jgcri.github.io/fldgen/
GNU General Public License v2.0

acs-intermediate fix to saved fldgen memory bloat #49

Closed abigailsnyder closed 4 years ago

abigailsnyder commented 4 years ago

An intermediate fix for https://github.com/JGCRI/fldgen/issues/25

This reduces what we save from a trained emulator to the bare-bones list entries needed for generating new fields. For the ISIMIP GFDL trained emulator, the saved object shrinks from 5.6 GB with everything to 2.1 GB. Hopefully this is small enough to work in the Cassandra pipeline.

The scripts in fldgen/inst/scripts are copies of the same files in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators, with fldgen/inst/scripts/train-emulators.R updated to call the new emulator_reducer function and save the smaller emulators. So in theory there are two options: the emulators can be rebuilt from scratch, or each existing emulator in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators can be loaded, reduced, and saved with a separate script that follows what has been added to fldgen/inst/scripts/train-emulators.R.
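The load, reduce, and save path for an existing emulator can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of train-emulators.R: `toy_reducer` and its `keep` argument are hypothetical stand-ins for `emulator_reducer`, the toy list and its entry names are invented, and the files are written to a temporary location rather than to the pic directory.

```r
# Hypothetical stand-in for fldgen's emulator_reducer(): keep only the named
# list entries that are needed downstream. The real function decides for
# itself which entries to keep; `keep` is an invented argument.
toy_reducer <- function(emu, keep) {
    emu[intersect(names(emu), keep)]
}

# Toy "trained emulator": a small part we need plus a bulky part we do not.
# Entry names are made up for illustration.
emu <- list(needed = rnorm(10), bulky = rnorm(1e5))

# Save the full object the way an existing emulator file would be stored.
fullfile <- tempfile(fileext = '.rds')
saveRDS(emu, fullfile)

# Load, reduce, and save under a different file name.
emu <- readRDS(fullfile)
reducedEmulator <- toy_reducer(emu, keep = 'needed')
reducedfile <- tempfile(fileext = '.rds')
saveRDS(reducedEmulator, reducedfile)

# Compare on-disk sizes (in bytes) of the full and reduced files.
sizes <- file.size(c(fullfile, reducedfile))
print(sizes)
```

The original file is left in place, so the reduced copy can be checked against it before any pointers are switched over.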

Note that when you load an RDS object written by fldgen/inst/scripts/train-emulators.R, it will come in with the name reducedEmulator and not emu. So either the Python code has to be adjusted, or this piece of code in the training script:

    reducedEmulator <- emulator_reducer(emu)
    outfilename <- paste0('fldgen-', model, '_reducedEmulator.rds')
    saveRDS(reducedEmulator, outfilename)

would have to be rewritten as:

    emu <- emulator_reducer(emu)
    outfilename <- paste0('fldgen-', model, '_reducedEmulator.rds')
    saveRDS(emu, outfilename)

I have left the two names distinct for now, so that the difference is clear within the fldgen package; Cassandra users can amend this according to their own preference.

The pointers in the Cassandra directory on pic will have to be updated to point to these smaller emulators.

abigailsnyder commented 4 years ago

The only failing test is the devel build on macOS, due to ncdf4 as usual.

abigailsnyder commented 4 years ago

Once this PR is merged, @abigailsnyder will use the new function to create reduced-size versions of all emulators in /pic/projects/GCAM/GE/drought-expt/fldgen-emulators/ with the correct variable names. This does not update the contents of any trained emulator and does not change the science. It just saves a copy of what is already being used, with fewer things in it.
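A batch pass over a directory of saved emulators might look something like the sketch below. It is only an illustration under assumptions: `toy_reducer` is a hypothetical stand-in for `emulator_reducer`, the 'fldgen-<model>.rds' input naming and the model names are invented, and the directory is a temporary one rather than the pic path.

```r
# Hypothetical stand-in for fldgen's emulator_reducer(); the entry name
# 'needed' is invented for this sketch.
toy_reducer <- function(emu) emu['needed']

# Set up a temporary directory with two toy emulator files. The
# 'fldgen-<model>.rds' naming is an assumption about the input files.
emudir <- file.path(tempdir(), 'toy-emulators')
dir.create(emudir, showWarnings = FALSE)
for (model in c('modelA', 'modelB')) {
    emu <- list(needed = rnorm(10), bulky = rnorm(1e4))
    saveRDS(emu, file.path(emudir, paste0('fldgen-', model, '.rds')))
}

# Find the full emulators, skipping any reduced copies from an earlier run.
infiles <- list.files(emudir, pattern = '^fldgen-.*\\.rds$', full.names = TRUE)
infiles <- infiles[!grepl('_reducedEmulator\\.rds$', infiles)]
outfiles <- sub('\\.rds$', '_reducedEmulator.rds', infiles)

# Reduce each emulator and save the copy next to the original, which is
# left untouched.
for (i in seq_along(infiles)) {
    emu <- toy_reducer(readRDS(infiles[i]))
    saveRDS(emu, outfiles[i])
}
```

Writing the reduced copies under a new suffix keeps the existing files available until the pointers have been updated.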