pSIMS is a suite of tools, data, and models developed to facilitate access to high-resolution climate impact modeling. This system largely automates the labor-intensive processes of creating and running data ingest and transformation pipelines and allows researchers to use high-performance computing to run simulations that extend over large spatial extents, run for many growing seasons, or evaluate many alternative management practices or other input configurations. In so doing, pSIMS dramatically reduces the time and technical skills required to investigate global change vulnerability, impacts and potential adaptations. pSIMS is designed to support integration and high-resolution application of any site-based climate impact model that can be compiled in a Unix environment (with a focus on primary production: agriculture, livestock, and forestry).
For more information about pSIMS, please see the following paper:
Elliott, Joshua, et al. (2014) The parallel system for integrating impact models and sectors (pSIMS). Environmental Modelling & Software 62: 509-516.
The original pSIMS was developed in 2014 (see the paper above). We updated pSIMS to pSIMSV2, which runs in a Unix environment with all dependencies provided by a Singularity container, so the software dependencies no longer need to be installed manually (as they did in pSIMS). We also updated some packages that were obsolete.
The Singularity images to run pSIMSV2 are hosted here, or here for APSIM Classic 7.9. All examples below are an implementation of pSIMSV2 for APSIM 7.9.
Regional scale estimations of sorghum biomass are crucial to identify optimum genotype × environment × management (G×E×M) combinations to ensure potential biomass production for bioenergy. This work was part of the TERRA project. Using pSIMSV2 we explored the following questions: which factors (G, E and M) are dominant in explaining sorghum biomass variability? and how do the drivers of sorghum biomass variability change with genotype and irrigation strategy at regional scale in the US?
Using the APSIM gridded platform (pAPSIM) within pSIMSV2, four genotypes [grain (GS), sudangrass (SS), photosensitive (PS) and photo-insensitive (PI)] were simulated across the potential areas for energy sorghum in the US under rainfed and irrigated conditions over 30 years.
These packages are installed automatically when the Singularity image is used for runs that include several tiles. When testing single tiles without Singularity, they must be installed manually.
Note: these packages are listed in the requirements.txt file at psims/pysims/
Some data inputs are provided for download and others are hosted in this repo.
Two full global gridded datasets are available for pSIMS users:
Due to the size of these datasets, they are available only via Globus online. If you do not already have a Globus account, you may create one at globus.org. The endpoint name is davidk#psims. Harmonized World Soil Database files are available in the .../soils/hwsd200.wrld.30min directory. AgMERRA climate data is available in the .../clim/ggcmi/agmerra directory.
You can also create your own datasets (or use others) and pass them to this tool.
A template to convert Daymet climate data to a pSIMS format was created by the authors of this repo using the PyDaymet Python package (contact them for more information). For PyDaymet information please visit the GitHub repo.
This template aggregates Daymet climate data from 1 km resolution to 30 arc-minute resolution for input to pSIMS, following the format established in psims2met.py (see the translators folder in this repo).
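The aggregation step can be illustrated with a minimal sketch. This is not the authors' template and does not use PyDaymet: it assumes the fine-resolution data have already been converted to (lat, lon, value) points in decimal degrees, and it simply averages all points that fall in the same 30 arc-minute (0.5 degree) cell. The function name `aggregate_to_cells` and the sample values are hypothetical.

```python
import math
from collections import defaultdict


def aggregate_to_cells(points, cell_deg=0.5):
    """Average point values into regular lat/lon cells.

    points: iterable of (lat, lon, value) tuples in decimal degrees.
    cell_deg: cell size in degrees (0.5 degrees = 30 arc-minutes).
    Returns {(cell_center_lat, cell_center_lon): mean_value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in points:
        # Integer index of the cell that contains this point.
        row = math.floor(lat / cell_deg)
        col = math.floor(lon / cell_deg)
        sums[(row, col)] += value
        counts[(row, col)] += 1
    return {
        ((row + 0.5) * cell_deg, (col + 0.5) * cell_deg):
            sums[(row, col)] / counts[(row, col)]
        for (row, col) in sums
    }


# Hypothetical temperatures (degrees C); the first two points share a cell.
sample = [(42.10, -93.40, 20.0), (42.30, -93.10, 22.0), (42.60, -93.40, 30.0)]
cells = aggregate_to_cells(sample)
```

A plain mean is used here for simplicity; whether a mean is appropriate for every climate variable is a choice to check against the template itself.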
The parameter file is a YAML-formatted file containing all the parameters of a pSIMSV2 run. It defines things like the number of simulation years, the path to climate input files, and which model to use. Below is a list of parameters with a description of what each one does.
Parameter | Description |
---|---|
aggregator | Aggregator options, used to average a variable across a region |
checker | Checker translator and options, checks whether a tile should be simulated or not |
delta | Simulation delta, gridcell spacing in arcminutes |
executable | Name of executable and arguments to run for each grid |
lat_zero | Top edge of the northernmost grid cell in the campaign |
lon_zero | Left edge of the westernmost grid cell in the campaign |
long_names | Long names for variables, in same order that variables are listed |
model | Defines the type of model to run. Valid options are dssat45, dssat46, apsim75 |
num_lats | Number of latitudes to be included in final nc4 file (starting with lat_zero) |
num_lons | Number of longitudes to be included in final nc4 file (starting with lon_zero) |
num_years | Number of years to simulate |
out_file | Defines the prefix of the final nc4 filename |
outtypes | File extensions of files to include in output tar file |
refdata | Directory containing reference data. Will be copied to each simulation |
ref_year | Reference year (the first year of the simulation) |
scens | Number of scenarios in the campaign |
soils | Directory containing soils |
tappcmp | Campaign translator and options |
tappinp | Input translator and options, goes from experiment.json and soil.json to model specific files |
tapptilewth | Weather tile translator and options |
tapptilesoil | Soil tile translator and options |
tappnooutput | The "no output" translator and options, typically used to create empty data |
tappwth | Weather translator and options, converts .psims.nc format into model-specific weather files |
tdelta | Tile delta, tile spacing in arcminutes |
postprocess | Name of translator and options to run after running executable |
var_units | Units to use for each variable, in the same order that variables are listed |
variables | Define the variables to extract to final outputs |
weather | Defines the directory where weather data is stored |
Below we provide an example of a parameter file:
model: apsim79
weather: .../agmerra2degtile
soils: .../gsde2degtile
refdata: .../refdata
out_file: output
executable: ...
outtypes: .met,.apsim,.out,.json,.txt
ref_year: 1980
num_years: 30
scen_years: 30
scens: 8
delta: "30,30"
tdelta: "120,120"
num_lats: 52
num_lons: 72
lat_zero: 49.75
lon_zero: -107.75
irr_flag: true
irr_1st: false
# Variables to extract
variables: planting_date,biomass,rad40DAS,rad80DAS,radHarv,temp40DAS,temp80DAS,tempHarv,rain40DAS,rain80DAS,rainHarv,RadiationIn,TempIn,aMinT,aMaxT,RainIn,sw_stress_expan,PAWC,DaysAfterSowing,FloweringDAS,IrrigationIn,sw_stress_photo,N_stress_expan,N_stress_photo,WU,potential_ET,actual_ET,LeafNo,MaxLAI,ESW1av,sw0_40,sw40_80,sw80_harv,tp0_40,tp40_80,tp80_harv,ri0_40,ri40_80,ri80_harv,DOY
var_units: "date,kg/ha,MJ/m2,MJ/m2,MJ/m2,oC,oC,oC,mm,mm,mm,MJ/m2,oC,oC,oC,mm,(0-1),mm,days,days,mm,,,mm,mm,mm,,,,,,,,,,,,,,"
long_names: "planting_date,biomass,rad40DAS,rad80DAS,radHarv,temp40DAS,temp80DAS,tempHarv,rain40DAS,rain80DAS,rainHarv,RadiationIn,TempIn,aMinT,aMaxT,RainIn,sw_stress_expan,PAWC,DaysAfterSowing,FloweringDAS,IrrigationIn,sw_stress_photo,N_stress_expan,N_stress_photo,WU,potential_ET,actual_ET,LeafNo,MaxLAI,ESW1av,sw0_40,sw40_80,sw80_harv,tp0_40,tp40_80,tp80_harv,ri0_40,ri40_80,ri80_harv,DOY"
# Only simulate points in the crop mask
#checker:
#    class: SimpleChecker
#    simgfile: .../masks/masks/cropmask.nc4
# Campaign translator
tappcmp:
    class: camp2json
    campaignfile: Campaign.nc4
    expfile: exp_template.json
    outputfile: experiment.json
# Input translator
tappinp:
    class: apsim75.jsons2apsim
    soilfile: soil.json
    soiltile: 1.soil.nc4
    expfile: experiment.json
    templatefile: template.apsim
    outputfile: Generic.apsim
# Weather translator
tappwth:
    class: apsim79.psims2met
    inputfile: 1.clim.nc4
    variables: tasmin,tasmax,rsds,pr,wind
    outputfile: Generic.met
# Post processing translation
postprocess:
    class: apsim79.out2psims
    inputfile: Generic.out
tapptilewth:
    class: tile_translator
tapptilesoil:
    class: tile_translator_soil
tappnooutput:
    class: nooutput2psims
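Because variables, var_units and long_names are matched by position, a quick consistency check before a run can catch a misaligned list. The following is a minimal sketch, not part of pSIMSV2: the helper names are hypothetical, the parameter values are shortened for illustration, and the "lat,lon" ordering of delta and tdelta is an assumption (both entries are equal in the example above).

```python
def split_list(value):
    """Split a comma-separated parameter value, keeping empty entries."""
    return value.split(",")


def check_param_lists(params):
    """Return the shared entry count of variables, var_units and long_names.

    Raises ValueError if the three lists differ in length.
    """
    counts = {key: len(split_list(params[key]))
              for key in ("variables", "var_units", "long_names")}
    if len(set(counts.values())) != 1:
        raise ValueError(f"list lengths differ: {counts}")
    return counts["variables"]


def cells_per_tile(delta, tdelta):
    """Number of grid cells per tile along each axis (assumed 'lat,lon' order)."""
    d_lat, d_lon = (int(x) for x in delta.split(","))
    t_lat, t_lon = (int(x) for x in tdelta.split(","))
    return t_lat // d_lat, t_lon // d_lon


# Shortened, hypothetical parameter values for illustration.
params = {
    "variables": "planting_date,biomass,sw_stress_photo",
    "var_units": "date,kg/ha,",
    "long_names": "planting_date,biomass,sw_stress_photo",
    "delta": "30,30",
    "tdelta": "120,120",
}
n_vars = check_param_lists(params)
per_tile = cells_per_tile(params["delta"], params["tdelta"])
```

With delta of 30 and tdelta of 120 arcminutes, each tile holds 4 x 4 grid cells. An empty unit (as for sw_stress_photo here) still counts as an entry, which is why empty fields are kept when splitting.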
When pysims is run, the user must specify a campaign directory with the --campaign parameter. Typically this campaign directory contains two relevant files named Campaign.nc4 and exp_template.json. These files are used by the jsons2dssat and jsons2apsim translators to create experiment files for the crop model.
The exp_template.json file contains key-value pairs for data that will be written to the experiment file. These values represent things like fertiliser application rates, irrigation rates and timing, and planting dates. Static settings for the experiment are stored in exp_template.json. Values that vary by lat, lon, scenario, or time are stored in Campaign.nc4.
Below is an example of an exp_template.json for a sorghum experiment:
{
  "crop_name": "Sorghum",
  "start_date": "01/01/1980",
  "end_date": "31/12/1985",
  "log": "",
  "reporting_frequency": "harvesting",
  "output_variables": [
    {"name": "dd/mm/yyyy as Date"},
    {"name": "planting_date"},
    {"name": "biomass"},
    {"name": "ExtinctionCoef"},
    {"name": "radInt"},
    {"name": "PAWC"},
    {"name": "WU"},
    {"name": "potential_ET"},
    {"name": "DaysAfterSowing"},
    {"name": "IrrigationIn"},
    {"name": "FloweringDAS"},
    {"name": "LeafNo"},
    {"name": "MaxLAI"},
    {"name": "RainIn"},
    {"name": "TempIn"},
    {"name": "aMinT"},
    {"name": "aMaxT"},
    {"name": "RadiationIn"},
    {"name": "FertiliserIn"},
    {"name": "actual_ET"}
  ],
  "initial_condition": {
    "icrn": ".5",
    "icrip": "100",
    "icnd": "0",
    "icrp": "0",
    "icrt": "300",
    "icrz#": "1",
    "icrze": "1",
    "icrag": "500",
    "icrdp": "30",
    "icdat": "19800101",
    "standing_fraction": "0",
    "water_fraction_full": "1",
    "soilLayer": [
      {"icno3": ".1", "icbl": "5", "icnh4": ".1"},
      {"icno3": ".1", "icbl": "15", "icnh4": ".1"},
      {"icno3": ".1", "icbl": "30", "icnh4": ".1"},
      {"icno3": ".1", "icbl": "100", "icnh4": ".1"},
      {"icno3": ".1", "icbl": "200", "icnh4": ".1"}
    ]
  },
  "weather": {
    "file": "Generic"
  },
  "planting": {
    "pdate": "15-may",
    "edate": "15-dec",
    "cultivar": "medium",
    "row_spacing": "0.7",
    "depth": "30",
    "sowing_density": "8",
    "skiprow": "solid",
    "ftn": "2"
  },
  "fertilizer": {
    "automatic_fertilizer": "off",
    "fert_criteria": "100",
    "fert_critical": "90",
    "type_auto": "NO3_N",
    "initial_amount": "200",
    "type": "NH4NO3",
    "days_after_sowing": "45",
    "subsequent_amount": "200",
    "depth": "40"
  },
  "irrigation": {
    "automatic_irrigation": "off",
    "asw_depth": "2000",
    "crit_fr_asw": "1",
    "efficiency": "1",
    "allocation_limits": "off",
    "allocation": "10",
    "default_no3_conc": "0.0",
    "default_nh4_conc": "0.0",
    "default_cl_conc": "0.0"
  },
  "reset": {
    "date": "14-may",
    "water": "yes",
    "nitrogen": "yes",
    "surfaceOM": "yes"
  }
}
However, users may not want to use these settings everywhere. If planting dates (pdate) change by location, users may create a variable in Campaign.nc4 called pdate. The most basic version of this would be a NetCDF variable in the format of float pdate(lat, lon). When pysims runs for a given point, the appropriate value is transferred from Campaign.nc4 into the experiment file. If pdate is not defined in Campaign.nc4, the static value from exp_template.json is used instead (15 May in this example). This process works the same way for all variables, not only pdate.
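The fallback rule described above can be sketched in a few lines. This is an illustration of the rule only, not the pysims implementation: plain dictionaries and nested lists stand in for exp_template.json and Campaign.nc4, and the function name `resolve_setting` and the sample values are hypothetical.

```python
def resolve_setting(name, template, campaign, lat_idx, lon_idx):
    """Return the campaign value for a grid cell, or the template default.

    template: flat dict of static settings (stand-in for exp_template.json).
    campaign: dict mapping variable name -> 2D list indexed [lat][lon]
              (stand-in for variables stored in Campaign.nc4).
    """
    if name in campaign:
        # The variable varies by location: use the value for this cell.
        return campaign[name][lat_idx][lon_idx]
    # Not defined in the campaign: fall back to the static template value.
    return template[name]


template = {"pdate": "15-may", "cultivar": "medium"}
campaign = {"pdate": [[120.0, 125.0], [130.0, 135.0]]}

pdate = resolve_setting("pdate", template, campaign, 1, 0)
cultivar = resolve_setting("cultivar", template, campaign, 1, 0)
```

Here pdate comes from the campaign data for cell (1, 0), while cultivar is not in the campaign and so keeps the template value.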
Below is an example of a Campaign.nc4 for a sorghum experiment. In this experiment we use variable planting dates (pdate), irrigation (automatic_irrigation) and genotype (cultivar). We combined 2 irrigation strategies, 4 genotypes and variable planting date by tile.
{
  dimensions:
    lat = 360;
    scen = 8;
    lon = 720;
  variables:
    double lat(lat=360);
      :units = "degrees_north";
      :_Storage = "contiguous";
    float cultivar(scen=8, lat=360, lon=720);
      :long_name = "GS,GS,SS,SS,FSPS,FSPS,FS,FS";
      :_DeflateLevel = 5; // int
      :units = "Mapping";
    float scen(scen=8);
    double lon(lon=720);
      :units = "degrees_east";
      :_Storage = "contiguous";
    float pdate(scen=8, lat=360, lon=720);
      :long_name = "Planting date";
      :_DeflateLevel = 5; // int
      :units = "Julian day";
    int automatic_irrigation(scen=8);
      :units = "Mapping";
      :long_name = "off,on,off,on,off,on,off,on";
  // global attributes:
  :person_notes = "Jonathan Ojeda";
  :history = "pSIMS setup for sorghum modelling in USA";
}
Below we provide a visual example of the variable planting date across a region in the US included in the campaign file:
The refdata folder contains an APSIM file (.apsim) called template.apsim. pSIMSV2 uses this file as the template for every simulation to create APSIM outputs. This is where specific output variables need to be created. For example, the following code calculates the accumulated irrigation applied from sowing to harvest:
<variable>sum of irrigation on end_of_day from sowing to harvesting as IrrigationIn</variable>
This variable then needs to be specified in the parameter file as an output named 'IrrigationIn' so that it is reported in the APSIM report.
The refdata folder also contains template APSIM XML files for all crops. Any change to the APSIM code (for example, the implementation of new cultivar parameters) needs to be coded here.
The parameter file has an option to apply a NetCDF mask file. Users who enable this option must indicate the location of the mask file.
# Only simulate points in the crop mask
checker:
    class: SimpleChecker
    simgfile: .../masks/masks/cropmask.nc4
The aggregation script is responsible for taking the final output of a psims simulation and computing the average value for a variable across some geographic region. To enable aggregation, add a section named 'aggregator' to your parameters file with the following parameters:
Parameter | Description |
---|---|
aggfile | Location of an aggfile. The aggfile contains information about geographic boundaries at given lats/lons. Common uses here are gadm regions and food producing units. |
weightfile | Location of the weightfile, used to give certain geographic areas more weight than others |
levels | Comma separated list of levels from the aggfile (example: gadm0, gadm1, gadm2) |
The aggfile and weightfile must match the resolution used in your simulation. To generate a new aggfile you can use the gdal_rasterize utility to convert from a gadm shapefile to a netcdf file, then use bin/create_agg_limits.py to add the required variables and dimensions.
Example parameters:
aggregator:
    aggfile: /path/to/agg.nc
    weightfile: /path/to/weight.nc
    levels: gadm0
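To make the role of the aggfile and weightfile concrete, here is a minimal sketch of a weighted regional mean. It is not the pSIMS aggregation script: nested lists stand in for the NetCDF grids, missing data are represented by None, and the function name and sample numbers are hypothetical.

```python
from collections import defaultdict


def aggregate_by_region(values, regions, weights):
    """Weighted mean of values per region.

    values, regions, weights: equally shaped 2D lists indexed [lat][lon].
    Cells whose value or region is None, or whose weight is zero, are skipped.
    Returns {region_id: weighted_mean}.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for value_row, region_row, weight_row in zip(values, regions, weights):
        for value, region, weight in zip(value_row, region_row, weight_row):
            if value is None or region is None or weight == 0:
                continue
            weighted_sum[region] += value * weight
            weight_total[region] += weight
    return {region: weighted_sum[region] / weight_total[region]
            for region in weighted_sum}


biomass = [[10.0, 20.0], [30.0, None]]  # simulated output per grid cell
gadm0 = [[1, 1], [2, 2]]                # region id per cell (aggfile level)
area = [[1.0, 3.0], [2.0, 5.0]]         # weight per cell (weightfile)
means = aggregate_by_region(biomass, gadm0, area)
```

The three grids must share the same shape, which mirrors the requirement that the aggfile and weightfile match the simulation resolution.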
A tilelist file contains a list of latitude and longitude indexes to be processed, in the format "latidx/lonidx". Here is an example:
0024/0044
0024/0045
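A tilelist in this format is simple to read and write programmatically. The sketch below parses the lines into integer pairs and formats them back with four-digit zero padding. It also includes a helper that maps a grid cell index to its tile index; that helper assumes 1-based indexes, a convention inferred only from the index pairs used in the example commands in this document (grid 0096/0173 inside tile 0024/0044, with delta of 30 and tdelta of 120 arcminutes), not taken from the pysims source. All function names are hypothetical.

```python
import math


def parse_tilelist(text):
    """Parse 'latidx/lonidx' lines into (latidx, lonidx) integer pairs."""
    tiles = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        lat_idx, lon_idx = line.split("/")
        tiles.append((int(lat_idx), int(lon_idx)))
    return tiles


def format_tile(lat_idx, lon_idx):
    """Format a pair back into the zero-padded 'latidx/lonidx' form."""
    return f"{lat_idx:04d}/{lon_idx:04d}"


def tile_of_grid(grid_idx, delta=30, tdelta=120):
    """Tile index containing a grid index (assumes 1-based indexes)."""
    return math.ceil(grid_idx / (tdelta // delta))


tiles = parse_tilelist("0024/0044\n0024/0045\n")
```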
Tile number can be calculated as follows:
for 0024_0044 (latidx=24; lonidx=44):
Before running pSIMSV2, you should check:
Note: Be sure to run sudo -s and enter your password before starting.
/psims/pysims/pysims.py --param .../params.apsim.sample --campaign .../campaign/created_campaign/test3/ --tlatidx 0024 --tlonidx 0044 --latidx 0096 --lonidx 0173
/psims/pysims/pysims.py --param .../params.apsim.sample --campaign .../campaign/created_campaign/test3/ --tlatidx 0024 --tlonidx 0044
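When many single-tile runs are scripted, it can help to build the argument list in one place. The sketch below only assembles the flags that appear in the commands above (--param, --campaign, --tlatidx, --tlonidx and, optionally, --latidx and --lonidx); it does not run anything. The function name and the placeholder paths are hypothetical, and the four-digit zero padding follows the examples in this document.

```python
def pysims_args(param, campaign, tlatidx, tlonidx, latidx=None, lonidx=None):
    """Build the pysims.py argument list for one tile or one grid cell."""
    args = [
        "--param", param,
        "--campaign", campaign,
        "--tlatidx", f"{tlatidx:04d}",
        "--tlonidx", f"{tlonidx:04d}",
    ]
    # Grid indexes are added only when both are given.
    if latidx is not None and lonidx is not None:
        args += ["--latidx", f"{latidx:04d}", "--lonidx", f"{lonidx:04d}"]
    return args


# Placeholder paths; replace with the real parameter file and campaign folder.
cell_args = pysims_args("params.apsim.sample", "campaign_dir/", 24, 44, 96, 173)
tile_args = pysims_args("params.apsim.sample", "campaign_dir/", 24, 44)
```

The resulting list can be appended to the path of pysims.py and passed to `subprocess.run` if the runs are launched from Python.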
In this case the command looks for the psims folder because, for more than one tile, Swift is used to run the tiles in parallel. Therefore, remember to install Swift on your computer before running several tiles together.
./psims -s local -p /psims/pysims/params.apsim.sample -c .../campaign/sorghum/ -t .../TileLists/test -r jon
.../psims/pysims/pysims.py --param .../params.apsim.sample --campaign .../campaign/created_campaign/test2/ --tlatidx 0024 --tlonidx 0044 --latidx 0096 --lonidx 0173
.../psims/pysims/pysims.py --param .../params.apsim.sample --campaign .../campaign/created_campaign/test2/ --tlatidx 0024 --tlonidx 0044
This command uses Swift to run tiles in parallel. Swift is already installed by the Singularity image.
singularity exec -B /data:/data -B /run/shm:/run/shm .../PSIMs.Apsim79.sapp .../psims/psims -s local -p .../paramsFiles/PetePC/params.apsim.sample -c .../campaign/created_campaign/test3/ -t .../TileLists/sorghumEnergy
export PYTHONNOUSERSITE=1
.../shfiles/pysims.sh --param .../params.apsim.sample --campaign .../campaign/created_campaign/test2/ --tlatidx 0024 --tlonidx 0044 --latidx 0096 --lonidx 0173
export PYTHONNOUSERSITE=1
.../shfiles/pysims.sh --param .../params.apsim.sample --campaign .../campaign/created_campaign/test2/ --tlatidx 0024 --tlonidx 0044
Note: this description applies to the code used to run pSIMSV2 on a computer accessed by remote control.
-s: indicates whether the run is executed on a computer (local) or on a cluster (cluster).
-p: indicates the parameter file location (e.g. .../paramsFiles/PetePC/params.apsim.sample)
-c: indicates the campaign file location (e.g. .../campaign/created_campaign/test3/)
-t: indicates the tile file location (e.g. .../TileLists/sorghumEnergy)
The output/ directory contains a directory for each latitude being processed. Within each latitude directory, a tar.gz file exists for each longitude. For example, if your gridList contained a grid 100/546, you would see an output file called runNNN/output/100/546output.tar.gz. This file is generated from within the Swift work directory. Which files get included in the file is determined by how you set "outtypes" in your parameter file.
The parts/ directory contains the output NetCDF files for each grid being processed. When grid 100/546 is done processing, you will see a file called runNNN/parts/100/546.psims.nc.
The combined nc file is saved in the runNNN directory. Its name depends on the value of "out_file" in your params file. If you set out_file to "out.psims.apsim75.cfsr.whea", the final combined nc file would be called "out.psims.apsim75.cfsr.whea.nc4".
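The naming rules above can be summarised in a small helper that returns the paths one would expect for a given grid and out_file value. This is a sketch based only on the file names quoted in this section; the function name and the run directory name "run001" are hypothetical.

```python
from pathlib import PurePosixPath


def expected_outputs(run_dir, lat_idx, lon_idx, out_file):
    """Expected output paths for one grid, following the layout described above."""
    run = PurePosixPath(run_dir)
    return {
        # Tarball of per-grid files selected by "outtypes".
        "tar": run / "output" / lat_idx / f"{lon_idx}output.tar.gz",
        # Per-grid NetCDF output.
        "part": run / "parts" / lat_idx / f"{lon_idx}.psims.nc",
        # Final combined file, named from "out_file".
        "combined": run / f"{out_file}.nc4",
    }


paths = expected_outputs("run001", "100", "546", "out.psims.apsim75.cfsr.whea")
```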
Below we provide an output example for the sorghum experiment. The figure shows the biomass yield of sorghum across a region in the US at 30 arc-minute resolution. This map is for a given year and genotype under rainfed conditions.
Fast visualisation of the pSIMSV2 outputs (and campaign files) can be done using Panoply. Panoply plots geo-referenced and other arrays from netCDF, HDF, GRIB, and other datasets. Panoply is a cross-platform application that runs on Macintosh, Windows, Linux and other desktop computers.