Thomas-Moore-Creative commented 3 years ago

There are a few possible approaches, and each could change how an "ensemble mean" for S2 is interpreted under different skill assessments.

Example directory structure from: /g/data/ux62/access-s2/hindcast/raw_model/ocean/hc300/monthly/e01

-rw-rw----+ 1 gay548 ux62  36517683 Nov 23  2020 mo_hc300_20181006_e01.nc
-rw-rw----+ 1 gay548 ux62  36539989 Nov 23  2020 mo_hc300_20181011_e01.nc
-rw-rw----+ 1 gay548 ux62  36541481 Nov 23  2020 mo_hc300_20181016_e01.nc
-rw-rw----+ 1 gay548 ux62  36556174 Nov 23  2020 mo_hc300_20181021_e01.nc
-rw-rw----+ 1 gay548 ux62  36551235 Apr 19 23:33 mo_hc300_20181024_e01.nc
-rw-rw----+ 1 gay548 ux62  36559196 May  7 15:49 mo_hc300_20181025_e01.nc
-rw-rw----+ 1 gay548 ux62  36553052 Nov 24  2020 mo_hc300_20181026_e01.nc
-rw-rw----+ 1 gay548 ux62  39134292 May  7 15:36 mo_hc300_20181027_e01.nc
-rw-rw----+ 1 gay548 ux62  39139681 Apr 14 19:21 mo_hc300_20181028_e01.nc
-rw-rw----+ 1 gay548 ux62  39138573 Apr 23 14:48 mo_hc300_20181029_e01.nc
-rw-rw----+ 1 gay548 ux62  39138792 Apr 14 19:37 mo_hc300_20181030_e01.nc
-rw-rw----+ 1 gay548 ux62  39140125 Feb 13 00:08 mo_hc300_20181031_e01.nc
-rw-rw----+ 1 gay548 ux62  39144075 Nov 23  2020 mo_hc300_20181101_e01.nc
-rw-rw----+ 1 gay548 ux62  36538575 Nov 23  2020 mo_hc300_20181106_e01.nc
-rw-rw----+ 1 gay548 ux62  36534441 Nov 23  2020 mo_hc300_20181111_e01.nc
-rw-rw----+ 1 gay548 ux62  36544617 Nov 23  2020 mo_hc300_20181116_e01.nc
-rw-rw----+ 1 gay548 ux62  36544897 Nov 23  2020 mo_hc300_20181121_e01.nc
-rw-rw----+ 1 gay548 ux62  36556693 Apr 14 06:05 mo_hc300_20181123_e01.nc
-rw-rw----+ 1 gay548 ux62  36554692 Apr 23 15:06 mo_hc300_20181124_e01.nc
-rw-rw----+ 1 gay548 ux62  36550741 May 14 17:16 mo_hc300_20181125_e01.nc
-rw-rw----+ 1 gay548 ux62  36566423 Nov 23  2020 mo_hc300_20181126_e01.nc
-rw-rw----+ 1 gay548 ux62  39144992 May 14 17:11 mo_hc300_20181127_e01.nc
-rw-rw----+ 1 gay548 ux62  39154377 Apr 23 14:27 mo_hc300_20181128_e01.nc
-rw-rw----+ 1 gay548 ux62  39150218 Apr 23 14:46 mo_hc300_20181129_e01.nc
-rw-rw----+ 1 gay548 ux62  39145038 Apr 14 19:23 mo_hc300_20181130_e01.nc
-rw-rw----+ 1 gay548 ux62  39145499 Nov 23  2020 mo_hc300_20181201_e01.nc

File naming: mo_variable_YYYYMMDD_ensemble.nc
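
To make the naming convention concrete, here is a minimal sketch that splits a file name into its parts. The helper name `parse_s2_filename` is hypothetical (it is not part of any BoM or NCI tooling), and the pattern assumes the `mo_` prefix and two-digit `eNN` member tag seen in the listing above.

```python
import datetime
import re

# Assumed pattern, based only on the listing above: mo_<variable>_<YYYYMMDD>_<eNN>.nc
_S2_NAME = re.compile(r"^mo_(?P<var>.+)_(?P<date>\d{8})_(?P<ens>e\d{2})\.nc$")


def parse_s2_filename(name):
    """Return (variable, start_date, ensemble_member) for one file name."""
    match = _S2_NAME.match(name)
    if match is None:
        raise ValueError("unexpected file name: " + name)
    start = datetime.datetime.strptime(match.group("date"), "%Y%m%d").date()
    return match.group("var"), start, match.group("ens")


print(parse_s2_filename("mo_hc300_20181024_e01.nc"))
```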

Consider coding up the following for comparison?

Simple day 1 = Average just the three ensemble members from mo_variable_YYYYMM01 into an emean product
Simple day 16 = Average just the three ensemble members from mo_variable_YYYYMM16 into an emean product
Full day 1 = Average across all three ensemble members and all start dates from mo_variable_YYYYMM21/22/23/24 (depending on month length) through the 1st of the following month, shifting the lead times of the files dated 21/22/23/24 to 28/29/30/31 so they line up with the day 1 start
Weighted full day 1 = Same as Full day 1 but with weights applied (open question: which weights, and why?)
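
As a rough sketch of how the "Simple day 1" and "Full day 1" options differ, the snippet below lists the lagged start dates that feed a day 1 product and averages a stacked array two ways. Everything here is an assumption for illustration: the function names are hypothetical, the nine-day window is read off the directory listing above, and the array is assumed to be already aligned in lead time (the lead-time shift itself is not handled).

```python
import datetime

import numpy as np


def lagged_start_dates(year, month, n_lags=9):
    """Start dates feeding the day 1 product, oldest first, ending on the 1st."""
    first = datetime.date(year, month, 1)
    return [first - datetime.timedelta(days=k) for k in range(n_lags - 1, -1, -1)]


def member_paths(root, variable, start, members=(1, 2, 3)):
    """File paths for every ensemble member at one start date."""
    stamp = start.strftime("%Y%m%d")
    return [f"{root}/e{e:02d}/mo_{variable}_{stamp}_e{e:02d}.nc" for e in members]


def simple_emean(fields):
    """'Simple day 1': average the members of the nominal start date only.

    fields has shape (lag, member, ...) with the nominal start date last.
    """
    return np.nanmean(fields[-1], axis=0)


def full_emean(fields):
    """'Full day 1': equal-weight average over all lags and members."""
    return np.nanmean(fields, axis=(0, 1))


dates = lagged_start_dates(2018, 11)
print(dates[0], dates[-1])  # 2018-10-24 2018-11-01
```

A weighted variant would replace `np.nanmean` in `full_emean` with a weighted average over the lag axis; the choice of weights is left open, as in the list above.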

Thomas-Moore-Creative commented 3 years ago

Next step: find out whether BoM colleagues have documented code for generating ensemble means from ACCESS-S2 lagged ensembles.

Thomas-Moore-Creative commented 2 years ago

Help from Grant Smith @ BoM

Two functions: one to get a climatology (get_custom_clim_s2_from_ens), and one to get the data for a single hindcast start date (get_emn_from_ens_s2). I then subtract what's returned for the climatology from what's returned for the data to get the anomaly.

data_loc='/g/data/ux62/access-s2/hindcast/raw_model/ocean/hc300/monthly/'
m, start_day are the month and day of interest
start_year, end_year define your hindcast period. For all of S2 use 1981 and 2018
t is the time step index to read
limits = [x_min, x_max, y_min, y_max] array indices; for global use [0, 1442, 0, 1021]
first_letter='m'
fn_var and nc_var is 'hc300'
num_ens=3
tl_days=9

import datetime

import numpy as np
from netCDF4 import Dataset

def get_custom_clim_s2_from_ens(data_loc,m,start_year,end_year,start_day,t,limits,first_letter,fn_var,nc_var,num_ens,tl_days):
    # Climatology for each lagged start date: mean over members, then over years
    print('clim',m,start_day,t)
    tls=[]
    for tl in range(0,tl_days):
        yrs=[]
        for y in range(start_year,end_year+1):
            es=[]
            for e in range(1,num_ens+1):
                date_stamp=datetime.date(y,m,start_day)
                # Hard coded: no lag is applied for January 1981
                # (presumably because earlier start dates precede the hindcast)
                if not (y == 1981 and m == 1):
                    date_stamp=date_stamp-datetime.timedelta(days=tl)
                file=data_loc+'/e'+str(e).zfill(2)+'/'+first_letter+'o_'+fn_var+'_'+date_stamp.strftime("%Y%m%d")+'_e'+str(e).zfill(2)+'.nc'
                print(file)
                cdf=Dataset(file,'r')
                try:
                    # Variable with a level dimension: take the first level
                    data=cdf.variables[nc_var][t,0,limits[2]:limits[3]+1,limits[0]:limits[1]+1]
                except ValueError:
                    # Variable without a level dimension
                    data=cdf.variables[nc_var][t,limits[2]:limits[3]+1,limits[0]:limits[1]+1]
                cdf.close()
                es.append(data)
            # Average across ensemble members for this year
            yrs.append(np.nanmean(es,axis=0))
        # Average across years for this lagged start date
        tls.append(np.nanmean(yrs,axis=0))
        print("tls=",np.shape(tls))
    return np.array(tls)

data_loc='/g/data/ux62/access-s2/hindcast/raw_model/ocean/hc300/monthly/'
y, m, start_day, t are the start date and time step of interest
limits = [x_min, x_max, y_min, y_max] array indices; for global use [0, 1442, 0, 1021]
first_letter='m'
fn_var and nc_var is 'hc300'
num_ens=3
tl_days=9

def get_emn_from_ens_s2(data_loc,y,m,start_day,t,limits,first_letter,fn_var,nc_var,num_ens,tl_days):
    # Ensemble mean for one hindcast start date, one field per lagged start date
    # Needs datetime, numpy (as np) and netCDF4.Dataset to be imported
    tls=[]
    for tl in range(0,tl_days):
        ens=[]
        for e in range(1,num_ens+1):
            # Step back tl days from the nominal start date
            date_stamp=datetime.date(y,m,start_day)-datetime.timedelta(days=tl)
            file=data_loc+'/e'+str(e).zfill(2)+'/'+first_letter+'o_'+fn_var+'_'+date_stamp.strftime("%Y%m%d")+'_e'+str(e).zfill(2)+'.nc'
            print(file)
            cdf=Dataset(file,'r')
            data=cdf.variables[nc_var][t,0,limits[2]:limits[3]+1,limits[0]:limits[1]+1]
            cdf.close()
            ens.append(data)
        # Average across ensemble members for this lagged start date
        tls.append(np.nanmean(ens,axis=0))
    # Mask is taken from the last file read
    mask=np.ma.getmask(data)
    return tls,mask
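
The subtraction step described above could look roughly like the sketch below. It is not Grant's code: `lagged_anomaly` is a hypothetical helper, the inputs are synthetic arrays standing in for the outputs of the two functions, and averaging the result over the lag axis afterwards is an assumption rather than something stated in the thread.

```python
import numpy as np


def lagged_anomaly(emn_tls, clim_tls, mask=None):
    """Subtract the climatology from the data, one field per lagged start date.

    emn_tls:  list/array of shape (tl_days, ny, nx), as from get_emn_from_ens_s2
    clim_tls: array of shape (tl_days, ny, nx), as from get_custom_clim_s2_from_ens
    mask:     optional (ny, nx) boolean mask, True where values are invalid
    """
    emn = np.asarray(emn_tls, dtype=float)
    clim = np.asarray(clim_tls, dtype=float)
    if emn.shape != clim.shape:
        raise ValueError("data and climatology shapes differ")
    anom = emn - clim
    if mask is not None and mask is not np.ma.nomask:
        # Re-apply the same 2-D mask to every lagged field
        full_mask = np.broadcast_to(np.asarray(mask, dtype=bool), anom.shape).copy()
        anom = np.ma.masked_array(anom, mask=full_mask)
    return anom


# Synthetic stand-ins: two lagged start dates on a 2x2 grid
emn = [np.full((2, 2), 3.0), np.full((2, 2), 5.0)]
clim = np.ones((2, 2, 2))
mask = np.array([[False, True], [False, False]])
anom = lagged_anomaly(emn, clim, mask)
print(anom.mean(axis=0))  # assumed final step: average over lagged start dates
```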

Thomas-Moore-Creative / NCI-ACCESS-S2-ARD

Generate monthly hindcast ensemble mean products from S2 lagged ensemble #3
