NCAR / CUPiD

CUPiD is a “one stop shop” that enables and integrates timeseries file generation, data standardization, diagnostics, and metrics from all CESM components.
https://ncar.github.io/CUPiD/
Apache License 2.0

Add ILAMB example #109

Closed by mnlevy1981 4 months ago

mnlevy1981 commented 5 months ago

Update the cupid-analysis environment to include ILAMB v2.7, and then create examples/ilamb.

mnlevy1981 commented 5 months ago

From the commit log for de42ba2:

cd examples/ilamb
qinteractive -l select=1:ncpus=16:mpiprocs=16:mem=100G
conda activate cupid-analysis
./test_run.sh

will run, but every line of output mentions MisplacedData, so I don't think it's configured quite right. Note that test_run.sh sets ILAMB_ROOT (to ${CUPID_ROOT}/ilamb_aux) and then runs

$ mpiexec ilamb-run --config ilamb_nohoff_final_CLM.cfg \
                    --build_dir bld/ \
                    --df_errs ../../ilamb_aux/quantiles_Whittaker_cmip5v6.parquet \
                    --define_regions ../../ilamb_aux/DATA/regions/LandRegions.nc ../../ilamb_aux/DATA/regions/Whittaker.nc \
                    --regions global \
                    --model_setup model_setup.txt \
                    --filter .clm2.h0.

I'd like to improve how we specify where the output is located (maybe auto-generate model_setup.txt?) and plug this all into cupid-run, but first we need to get ilamb-run working.
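One way the auto-generation idea could look is a small shell helper that writes one line per model. This is only a sketch: the function name `write_model_setup` and the label and path in the usage note are hypothetical, and the comma-separated `name, path` layout (with `#` comment lines) is an assumption that should be checked against the ILAMB documentation for `--model_setup` before relying on it.

```shell
#!/usr/bin/env bash
# Sketch: write a model_setup.txt with one "label, path" line per model.
# Assumption: ilamb-run --model_setup reads comma-separated "name, path"
# lines and skips lines that start with '#'.
write_model_setup() {
    local outfile="$1"
    shift
    # Header comment describing the columns.
    printf '# Model Name, Location of Files\n' > "$outfile"
    # Remaining arguments are consumed as (label, path) pairs.
    while [ "$#" -ge 2 ]; do
        printf '%s, %s\n' "$1" "$2" >> "$outfile"
        shift 2
    done
}
```

A call such as `write_model_setup model_setup.txt my_case /path/to/timeseries` (both values are placeholders) would replace hand-editing the file before running test_run.sh.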

mnlevy1981 commented 5 months ago

As of 59f1507 this example is actually running ILAMB, so the one-hour walltime is probably insufficient. The script the TSS / LMWG folks run uses 12:00:00, so maybe we should do

cd examples/ilamb
qinteractive -l select=1:ncpus=16:mpiprocs=16:mem=100G -l walltime=12:00:00
conda activate cupid-analysis
./test_run.sh

TeaganKing commented 5 months ago

Just a note: in attempting to check the walltime this uses, I'm getting a bunch of ILAMB.ilamblib.VarNotInModel: VarNotInModel errors. I think we may need to update the configuration file to address this.
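To see how widespread errors like these are, the output of a run can be tallied by exception name. The sketch below assumes the run's output has been captured to a text file; the function name `summarize_ilamb_errors` and the log file name in the usage note are hypothetical, and the pattern only matches names of the form `ILAMB.ilamblib.<Name>` as they appear in the messages quoted in this thread.

```shell
#!/usr/bin/env bash
# Sketch: count ILAMB exception names found in a captured log file.
# Prints one "count name" line per distinct exception, sorted by name.
summarize_ilamb_errors() {
    local logfile="$1"
    # -o prints only the matched exception name, one per line.
    grep -o 'ILAMB\.ilamblib\.[A-Za-z]*' "$logfile" \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}'
}
```

For example, after `./test_run.sh > run.log 2>&1` (the file name is a placeholder), `summarize_ilamb_errors run.log` shows whether VarNotInModel dominates or whether other exceptions such as MisplacedData are still present.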

TeaganKing commented 5 months ago

A couple of notes from our meeting that are relevant to this PR (should probably be done in this PR; could also merge this PR as a proof of concept and then add an issue ticket for these items):

For debugging ILAMB, it will be useful to delete the bld directory and try restarting. We want to ensure that we're using Keith's ILAMB LMWG version. The main ILAMB repo also doesn't have CLM-specific config files, though they may want to add these. Keith's branch should eventually be brought back into the main ILAMB repo, but this may take a while, so we are temporarily sticking with Keith's branch.
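The delete-and-restart step could be wrapped in a small helper so it is easy to repeat while debugging. This is a sketch only: the function name `clean_build_dir` is hypothetical, and it assumes the build directory is the `bld/` passed as `--build_dir` in the command earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch: remove a stale ILAMB build directory so the next run starts clean.
# Defaults to "bld", the --build_dir used by the example command.
clean_build_dir() {
    local build_dir="${1:-bld}"
    if [ -d "$build_dir" ]; then
        # "--" guards against directory names that begin with a dash.
        rm -rf -- "$build_dir"
        echo "removed $build_dir"
    else
        echo "nothing to remove at $build_dir"
    fi
}
```

From examples/ilamb, running `clean_build_dir bld` followed by `./test_run.sh` would then restart the analysis without any cached results.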

TeaganKing commented 4 months ago

This version is now working as far as running ILAMB. The aforementioned features should still be implemented. One additional note to record here is that the parallel efficiency with CUPiD on Derecho (21%) seems to be lower than what Keith had on Casper (46%).
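For reference, parallel efficiency is commonly defined as the serial runtime divided by the product of the process count and the parallel runtime. The thread does not say how the 21% and 46% figures were obtained, so the sketch below only illustrates that common definition; the function name `parallel_efficiency` and the timings in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: parallel efficiency in percent, E = 100 * T1 / (N * TN), where
# T1 is the serial runtime, N the process count, and TN the parallel runtime.
# Assumption: this is the textbook definition, not necessarily the metric
# reported for the Derecho and Casper runs.
parallel_efficiency() {
    local t_serial="$1" n_procs="$2" t_parallel="$3"
    awk -v t1="$t_serial" -v n="$n_procs" -v tn="$t_parallel" \
        'BEGIN { printf "%.1f\n", 100 * t1 / (n * tn) }'
}
```

With made-up timings, `parallel_efficiency 3200 16 400` prints `50.0`, meaning the 16 processes deliver half of the ideal speedup.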

TeaganKing commented 4 months ago

This new issue ticket includes details for automatically updating the configuration files once CUPiD is set up to run from the CESM workflow. For now, we'd like to merge this in as is.

mnlevy1981 commented 4 months ago

Yeah, GitHub won't let me give this a green check, but I think it looks great (and #115 highlights what's missing so we can improve on the implementation in the future).