JGCRI / cassandra

Human-earth system multi-scale model coupling framework

Add Tgav Stub Component #61

Open crvernon opened 4 years ago

crvernon commented 4 years ago

The goal was to create a Tgav stub component that extracts data for a specific configuration (e.g., scenario) from the ESM runs that were used to train the fldgen emulator.

Per @abigailsnyder:

"If ESM runs are all ISIMIP data, the quantity with the variable name tgav is actually land average temperature and not global average temperature as the name suggests."

"A compiled Rmarkdown notebook doing this extraction and labeling for a trained emulator from /pic/projects/GCAM/GE/drought-expt/fldgen-emulators: extracting-tgav-from-trained-emu.zip"

The component named TgavStubComponent was added to Cassandra and provides the following capabilities:

The configuration block for the TgavStubComponent is the following:

[TgavStubComponent]
rds_file = <full path with filename and extension to the target RDS file>
climate_var_name = tasAdjust
scenario = rcp26
units = Kelvin
start_year = 1861
through_year = 2099
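
As a rough illustration of what a run-time check of a block like this could look like, here is a minimal sketch using Python's standard `configparser`. The function name `validate_tgav_config`, the list of required keys, and the specific rules are assumptions made for this example; they are not taken from the Cassandra source. `through_year` is treated as inclusive, which matches the 239 annual values reported for 1861 through 2099.

```python
import configparser

# Keys shown in the configuration block; treating all of them as required is an assumption.
REQUIRED_KEYS = ("rds_file", "climate_var_name", "scenario", "units",
                 "start_year", "through_year")


def validate_tgav_config(config_text):
    """Parse a [TgavStubComponent] block and return its checked settings."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["TgavStubComponent"]

    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError(f"missing required keys: {missing}")

    # getint raises ValueError if the value is not an integer.
    start_year = section.getint("start_year")
    through_year = section.getint("through_year")
    if start_year > through_year:
        raise ValueError("start_year must not be later than through_year")

    settings = dict(section)
    settings["start_year"] = start_year
    settings["through_year"] = through_year
    # through_year is treated as inclusive (an assumption of this sketch).
    settings["n_years"] = through_year - start_year + 1
    return settings


example = """
[TgavStubComponent]
rds_file = /path/to/emulator.rds
climate_var_name = tasAdjust
scenario = rcp26
units = Kelvin
start_year = 1861
through_year = 2099
"""

settings = validate_tgav_config(example)
print(settings["n_years"])  # 239
```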

This component validates its inputs at run time and comes with a test suite. All tests for this component are now passing locally. The metadata output summarizes both the Tgav configuration and the data itself. The following is logged from the tgav_metadata capability:

INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: rds_file==/Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: scenario==rcp26
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: climate_var_name==tasAdjust
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: source_climate_data==./training-data/tasAdjust_annual_IPSL-CM5A-LR_rcp26_18610101-20991231.nc
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: units==Kelvin
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: count==239
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: mean==286.8046116940164
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: median==286.30762280534697
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: min==284.78008340211915
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: max==288.6382439866686
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: std==1.182328423261446
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: na_count==0
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: null_count==0
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: all_finite==True
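
For readers who want to reproduce statistics of this kind, the following is a minimal standard-library sketch. The function `summarize_tgav` is hypothetical and is not the component's implementation. In particular, it assumes that `na_count` counts NaN values, that `null_count` counts missing (`None`) entries, and that `std` is the population standard deviation; the component may define any of these differently.

```python
import math
import statistics


def summarize_tgav(values):
    """Return summary statistics similar to those logged for 'Tgav'."""
    null_count = sum(1 for v in values if v is None)
    present = [v for v in values if v is not None]
    na_count = sum(1 for v in present if math.isnan(v))
    # Statistics are computed over finite values only.
    finite = [v for v in present if math.isfinite(v)]
    if not finite:
        raise ValueError("no finite values to summarize")
    return {
        "count": len(values),
        "mean": statistics.mean(finite),
        "median": statistics.median(finite),
        "min": min(finite),
        "max": max(finite),
        # Population standard deviation (an assumption of this sketch).
        "std": statistics.pstdev(finite),
        "na_count": na_count,
        "null_count": null_count,
        "all_finite": len(finite) == len(values),
    }


# Small made-up series, not data from the emulator.
summary = summarize_tgav([286.0, 287.0, 288.0, float("nan"), None])
for key, value in summary.items():
    print(f"{key}=={value}")
```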

There are currently no constraints on these reported statistics, though they can easily be added if desired. @kdorheim provided a summary of over 50 CMIP5 runs, from Taylor (2012), that we could use for min and max constraints. It is stored at /pic/projects/GCAM/Dorheim/hectorcal/data-raw/CMIP5_annual_global_average.csv:

variable  unit  experiment  min       max       mean
tas       K     historical  284.3061  289.1264  286.7314

Please also consider whether constraints are needed for non-finite, missing (no data), and NaN values, should they be present in the data.
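
To make the idea concrete, here is one possible shape for such a check. It is only a sketch: the function `check_tgav_constraints` is hypothetical, and the bounds are simply the min and max from the CMIP5 historical `tas` row above. Those bounds describe global averages, so they may need adjusting when the Tgav series is land-only.

```python
import math

# Bounds taken from the CMIP5 historical `tas` row quoted above (Kelvin).
CMIP5_TAS_MIN_K = 284.3061
CMIP5_TAS_MAX_K = 289.1264


def check_tgav_constraints(values, lower=CMIP5_TAS_MIN_K, upper=CMIP5_TAS_MAX_K):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for index, value in enumerate(values):
        if value is None:
            problems.append(f"index {index}: no data")
        elif math.isnan(value):
            problems.append(f"index {index}: NaN")
        elif not math.isfinite(value):
            problems.append(f"index {index}: non-finite value {value}")
        elif not lower <= value <= upper:
            problems.append(f"index {index}: {value} K outside [{lower}, {upper}] K")
    return problems


# The min and max from the logged summary both fall inside the CMIP5 range.
logged_extremes = [284.78008340211915, 288.6382439866686]
print(check_tgav_constraints(logged_extremes))  # []
```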

The following full run for this configuration was completed successfully:

[Global]
#  Location of the jar file for the ModelInterface code, used to query GCAM outputs.
ModelInterface = /Users/d3y010/projects/cassandra/data/ModelInterface.jar

# Location of the DBXML libraries used by older versions of the ModelInterface code.
DBXMLlib = /Users/d3y010/projects/cassandra/data/lib

# Directory containing general input files.  (OPTIONAL - default is './input-data').
# Relative paths will be interpreted relative to the working directory
# (even if they don't begin with './')
inputdir = ./input-data
rgnconfig = rgn32

[TgavStubComponent]
rds_file = /Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
climate_var_name = tasAdjust
scenario = rcp26
units = Kelvin
start_year = 1861
through_year = 2099

[XanthosComponent]
config_file = /Users/d3y010/projects/cassandra/models/xanthos/trn_abcd_IPSL-CM5A-LR.ini
OutputNameStr = trn_abcd_IPSL-CM5A-LR_rcp26_0_alt
ProjectName = trn_abcd_IPSL-CM5A-LR_rcp26_0_alt
mp.weight = 8.0

[FldgenComponent]
loadpkgs = False
pkgdir = .
emulator = /Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
ngrids = 2
scenario = rcp26
a2mfrac = alpha_ipsl_cm5a_lr
startyr = 1861
nyear = 239
RNGseed = -197043577
mp.weight = 10.0
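
Because the Tgav stub and Fldgen sections describe the same emulator, scenario, and time span under different key names, a small cross-check can catch mismatches before a run. The sketch below is illustrative only; `check_year_consistency` is a hypothetical helper and not part of Cassandra, and it assumes `through_year` is inclusive so that 1861 through 2099 gives `nyear = 239`.

```python
import configparser


def check_year_consistency(config_text):
    """Compare the Tgav stub settings with the matching Fldgen settings."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    tgav = parser["TgavStubComponent"]
    fldgen = parser["FldgenComponent"]

    # Inclusive year count covered by the Tgav stub.
    tgav_years = tgav.getint("through_year") - tgav.getint("start_year") + 1
    return {
        "same_start": tgav.getint("start_year") == fldgen.getint("startyr"),
        "same_length": tgav_years == fldgen.getint("nyear"),
        "same_scenario": tgav["scenario"] == fldgen["scenario"],
        "same_emulator": tgav["rds_file"] == fldgen["emulator"],
    }


# Trimmed copy of the two relevant sections from the configuration above.
example = """
[TgavStubComponent]
rds_file = /Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
scenario = rcp26
start_year = 1861
through_year = 2099

[FldgenComponent]
emulator = /Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
scenario = rcp26
startyr = 1861
nyear = 239
"""

checks = check_year_consistency(example)
print(checks)
```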

With a logfile of:

DEBUG:root:General parameters as input:
DEBUG:root:{}
INFO:root:running <class 'cassandra.components.GlobalParamsComponent'>
DEBUG:root:starting <class 'cassandra.components.GlobalParamsComponent'>
INFO:root:[None]: default path= None  filename= /Users/d3y010/projects/cassandra/data/ModelInterface.jar
INFO:root:[None]: default path= None  filename= /Users/d3y010/projects/cassandra/data/lib
INFO:root:running <class 'cassandra.components.TgavStubComponent'>
INFO:root:[None]: default path= /Users/d3y010/repos/github/cassandra/cassandra  filename= ./input-data
DEBUG:root:starting <class 'cassandra.components.TgavStubComponent'>
INFO:root:[None]: default path= /Users/d3y010/repos/github/cassandra/cassandra/input-data  filename= rgn32
INFO:root:running <class 'cassandra.components.XanthosComponent'>
DEBUG:root:<class 'cassandra.components.GlobalParamsComponent'>: finished successfully.

DEBUG:root:starting <class 'cassandra.components.XanthosComponent'>
INFO:root:running <class 'cassandra.components.FldgenComponent'>
DEBUG:root:completed <class 'cassandra.components.GlobalParamsComponent'>
DEBUG:root:starting <class 'cassandra.components.FldgenComponent'>
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: rds_file==/Users/d3y010/projects/cassandra/models/fldgen/fldgen-IPSL-CM5A-LR.rds
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: scenario==rcp26
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: climate_var_name==tasAdjust
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: source_climate_data==./training-data/tasAdjust_annual_IPSL-CM5A-LR_rcp26_18610101-20991231.nc
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: units==Kelvin
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: count==239
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: mean==286.8046116940164
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: median==286.30762280534697
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: min==284.78008340211915
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: max==288.6382439866686
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: std==1.182328423261446
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: na_count==0
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: null_count==0
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: all_finite==True
DEBUG:root:<class 'cassandra.components.TgavStubComponent'>: finished successfully.

DEBUG:root:completed <class 'cassandra.components.TgavStubComponent'>
DEBUG:root:Result for tas: len = 2. Shape = (67420, 2868)
DEBUG:root:Result for pr: len = 2. Shape = (67420, 2868)
DEBUG:root:<class 'cassandra.components.FldgenComponent'>: finished successfully.

DEBUG:root:completed <class 'cassandra.components.FldgenComponent'>
INFO:root:ProjectName : trn_abcd_IPSL-CM5A_gexp
INFO:root:InputFolder : /Users/d3y010/repos/github/xanthos/example/input
INFO:root:OutputFolder: /Users/d3y010/repos/github/xanthos/example/output/trn_abcd_IPSL-CM5A_gexp
INFO:root:StartYear - End Year: 1861-2099
INFO:root:Number of Months    : 2868
INFO:root:Running: Future Mode
INFO:root:TempMinFile variable not found for the ABCD runoff module; Snowmelt will not be accounted for.
INFO:root:---Simulation in progress...
INFO:root:  Processing PET...
INFO:root:  PET processed in 17.734493017196655 seconds---
INFO:root:  Processing Runoff...
INFO:root:      Processing spin-up and simulation for basins 1...235
INFO:root:  Runoff processed in 45.51595687866211 seconds---
INFO:root:---Simulation has finished successfully: 63.26426696777344 seconds ---
INFO:root:---Output simulation results:
DEBUG:root:Outputting data annually
DEBUG:root:Unit is km3peryear
DEBUG:root:q output dimension is (67420, 239)
INFO:root:Aggregating by Basin
INFO:root:Aggregated unit is km3peryear
INFO:root:---Output finished: 8.618645906448364 seconds ---
INFO:root:End of trn_abcd_IPSL-CM5A_gexp
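
If the summary values need to be pulled back out of a logfile like this one, a regular expression over the `'Tgav' data summary` lines is probably enough. The sketch below assumes the line layout shown above; `parse_tgav_summary` is a hypothetical helper, not a Cassandra function.

```python
import re

# Matches lines of the form:
# INFO:root:<class '...'> 'Tgav' data summary: key==value
SUMMARY_PATTERN = re.compile(
    r"^INFO:root:<class '(?P<component>[^']+)'> 'Tgav' data summary: "
    r"(?P<key>\w+)==(?P<value>.*)$"
)


def parse_tgav_summary(log_text):
    """Collect the key==value pairs from 'Tgav' data summary log lines."""
    summary = {}
    for line in log_text.splitlines():
        match = SUMMARY_PATTERN.match(line.strip())
        if match:
            summary[match.group("key")] = match.group("value")
    return summary


log_excerpt = """
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: scenario==rcp26
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: count==239
INFO:root:<class 'cassandra.components.TgavStubComponent'> 'Tgav' data summary: all_finite==True
DEBUG:root:<class 'cassandra.components.TgavStubComponent'>: finished successfully.
INFO:root:Number of Months    : 2868
"""

summary = parse_tgav_summary(log_excerpt)
# 239 annual values correspond to the 2868 months reported by Xanthos.
print(summary, int(summary["count"]) * 12)
```

Values are kept as strings here; converting them (for example `count` to `int`) is left to the caller.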

I also checked the Xanthos outputs for global runoff per year, resulting in the following:

[Figure: test_runoff]

Resolves #58

crvernon commented 4 years ago

CI will fail for now because the previous version of .travis.yml does not load an R environment.

claudiatebaldi commented 4 years ago

Looks good to me. Only one comment: is there a way to keep track of the fact that Tgav is sometimes defined as land-only and sometimes (most of the time) refers to the land+ocean average?

crvernon commented 4 years ago

@claudiatebaldi Thanks! I am not sure and will have to think about this. I think it would be assumed from the parent model in use. Perhaps we just make it a parameter in the config file that the user can specify. Although, aren't the land-only values taking the land+ocean interactions into account anyway, even if they are not reporting ocean grid values?
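
As a rough sketch of the config-parameter idea: the `tgav_coverage` key and its allowed values below are made up for illustration and do not exist in Cassandra. The sketch requires the user to state the coverage explicitly rather than guessing a default.

```python
import configparser

# Hypothetical option values; neither the key nor these names exist in Cassandra.
ALLOWED_COVERAGE = {"land_only", "land_and_ocean"}


def read_tgav_coverage(config_text):
    """Return the declared spatial coverage of the Tgav series."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["TgavStubComponent"]

    # SectionProxy.get returns None when the option is absent.
    coverage = section.get("tgav_coverage")
    if coverage not in ALLOWED_COVERAGE:
        raise ValueError(
            f"tgav_coverage must be one of {sorted(ALLOWED_COVERAGE)}, got {coverage!r}"
        )
    return coverage


# ISIMIP-based training data would be declared as land-only.
example = """
[TgavStubComponent]
scenario = rcp26
tgav_coverage = land_only
"""

coverage = read_tgav_coverage(example)
print(coverage)  # land_only
```

The declared value could then be written out alongside the other metadata so that downstream users see what the series represents.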

claudiatebaldi commented 4 years ago

Not sure what you mean by the "although" question, Chris. My only point is that if one is using a Tgav series, it would be good to know whether it is a global average or just land-only. But then again, I'm not sure how this would work operationally, so it may be a redundant piece of information, already obvious to the user.

kdorheim commented 4 years ago

I agree with @claudiatebaldi that this is useful information to have. While it might be obvious to the original user, I think specifying it would be extremely useful for other users who need to understand what is actually being read into the pipeline.

bpbond commented 4 years ago

FYI, note that SULI Skylar Gering's project this summer is to split Hector Tgav into separate land and ocean variables, following e.g. MAGICC, with land Tgav warming faster in general.

abigailsnyder commented 4 years ago

@crvernon note this PR in the fldgen repo (implementing updates discussed in group call about 3.5 or 4 weeks ago) that may be useful https://github.com/JGCRI/fldgen/pull/49