Open chunhsusu opened 1 year ago
Happy to implement this.
I guess for temperature we'll need to implement the option of picking resample('1D').mean()
for indices that require daily mean temperature, resample('1D').max()
for indices that require daily maximum temperature and resample('1D').min()
for indices that require daily minimum temperature?
That's right. Note that for era5 data, they have flipped the order of latitude, and the longitude goes from -180 to 180. I don't think it affects your implementation, but something to worry about when we compare between data sets.
@chunhsusu - One more question. Do you want to calculate indices for the entire globe using the ERA5 data or just the Australian region? (If the latter I can add spatial subsetting functionality.)
@DamienIrving For efficiency, yes I think the functionality should be added. Our evaluation work will only be over the Australian region.
Subsetting option added: https://github.com/AusClimateService/indices/commit/e9a6985d2bacdbffc0db00e8074aeb2340ca9b2a
Hi @DamienIrving, thanks for implementing this, I have some questions.
time_agg
apply for bivariate indices? tp
and mtpr
? ECMWF provides this guide which I have used before to calculate daily rainfall using tp.Hi @ngben , On tp v mtpr. We are using the mtpr data archive in /g/data/rt52/era5/single-levels/reanalysis/mtpr/
, and there the mtpr is defined as "Mean total precipitation rate". For computing daily total, it makes sense.
How is "tp" defined? rt52 does not seem to have it.
Hi @chunhsusu
How is "tp" defined? rt52 does not seem to have it.
tp is defined as total precipitation (https://apps.ecmwf.int/codes/grib/param-db/?id=228)
I guess if using mtpr the time_agg in run_icclim.py would be "mean"?
For tp when using cdo to calculate the daily amount I used cdo daysum
I downloaded hourly tp last year, I'm not sure why rt52 doesn't have it
@ngben:
time_agg
flag twice. For example, diurnal temperature range for ERA5 data would look like the following with the --input_files
, --variable
and --time_agg
flags used twice, the first time with the information needed to create the tmax data and the second time to create the tmin data:/g/data/xv83/dbi599/miniconda3/envs/icclim/bin/python run_icclim.py dtr /g/data/ia39/australian-climate-service/test-data/CORDEX-CMIP6/indices/AUS-gn/none/ECMWF-ERA5/evaluation/r1i1p1f1/none/none/climdex/dtr/dtr_AUS-gn_ECMWF-ERA5_evaluation_r1i1p1f1_year_195901-202112.nc --input_files /g/data/rt52/era5/single-levels/reanalysis/2t/*/*.nc --variable t2m --time_agg max --input_files /g/data/rt52/era5/single-levels/reanalysis/2t/*/*.nc --variable t2m --time_agg min --start_date 1959-01-01 --end_date 2021-12-31 --lon_bnds 111.975 156.275 --lat_bnds -44.525 -9.975 --verbose
--hour_shift
option? What shift needs to be applied? Thanks @DamienIrving
cdo daysum -shifttime,-1hour
Also I'm not sure if it's an issue but ERA5 is in short data format which can cause problems when concatenating as the offsets and scale factors may be different between files. For cdo this requires the use of -b F64
A quick glance through the ECMWF ERA5 wiki suggests that 2t
/t2m
is instantaneous but mx2t
and mn2t
are the maximum/minimum "since previous post-processing" https://confluence.ecmwf.int/display/CKB/ERA5%3A+2+metre+temperature
I think mx2t
and mn2t
would need to be shifted along with tp
but I don't know if mtpr
also needs it.
I've added a --hshift
option that moves the time axis back 1 hour: https://github.com/AusClimateService/indices/commit/692436233bb50ff44046b7c0384096cc0f499a0a
In terms of different scale and offset factors for different files, I'm pretty sure the mask_and_scale=True
argument that is passed toxr.open_mfdataset
and xr.open_dataset
in the script handles that.
Thanks Damien!
For mtpr
it should also be shifted as it's for the previous hour
reanalysis: accumulations are over the hour (the accumulation/processing period) ending at the validity date/time
@ngben @DamienIrving thank you for raising this. I have been using mtpr and 2t in the icclim calculation. It appears that I should use mx2t and mn2t instead. I can re-run icclim on ERA5, i.e., TX based (TN-based) indicators with mx2t (mn2t) and with --hshift. And precip indicator with mtpr with --hshift.
Please advise if that sounds correct?
@ngben @DamienIrving thank you for raising this. I have been using mtpr and 2t in the icclim calculation. It appears that I should use mx2t and mn2t instead. I can re-run icclim on ERA5, i.e., TX based (TN-based) indicators with mx2t (mn2t) and with --hshift. And precip indicator with mtpr with --hshift.
Please advise if that sounds correct?
Perhaps it might be best to use 2t for tx/tn? ERA5 documentation states:
Given this inconsistency in these three parameters on the CDS, we recommend, in general, that the hourly (analysed) "2 metre temperature" be used to construct the minimum and maximum over longer periods, such as a day.
I think --hshift should be used for mtpr
@ngben @DamienIrving thank you for raising this. I have been using mtpr and 2t in the icclim calculation. It appears that I should use mx2t and mn2t instead. I can re-run icclim on ERA5, i.e., TX based (TN-based) indicators with mx2t (mn2t) and with --hshift. And precip indicator with mtpr with --hshift. Please advise if that sounds correct?
Perhaps it might be best to use 2t for tx/tn? ERA5 documentation states:
Given this inconsistency in these three parameters on the CDS, we recommend, in general, that the hourly (analysed) "2 metre temperature" be used to construct the minimum and maximum over longer periods, such as a day.
I think --hshift should be used for mtpr
Thank you Ben, I will follow this.
Request implementation to allow use with ERA5 data in rt52:
/g/data/rt52/era5/single-levels/reanalysis/2t/$year/2t_*.nc
(variable name ist2m
in units of K)/g/data/rt52/era5/single-levels/reanalysis/mtpr/$year/mtpr_*.nc
(variable name ismtpr
in kg m-2 s-1units of )I assume icclim does not handle hourly dat, so an additional argument can be passed to run_icclim.py, which tells xr.dataarray how to preprocess the hourly data first such as resample('1D').mean() etc.