Open YANG11SD opened 1 year ago
My Greetings,
Basically to create a map, you perform some calculation to collapse the time dimension. For the flash drought (FD) frequency, I generally remove points that identify multiple FD (for the figure, I only want to count 1 FD in each year), then I sum over the time axis to get the total number of years that experienced FD (you can also divide by the total number of years in the dataset to get the percentage of years in which FD was found). In terms of calculating FD for multiple dimensions, I have a function (attached at the bottom, it is christian_fd(), most of the remaining functions are support functions it uses) that will take in a multidimensional array of SESR and identify FD for each pentad and grid point in that array. I also added a function I've been using for making the frequency maps, so you can see how that works (the function is display_fd_climatology()).
I will also add a note here, this code is quite old (I had even forgotten this repository is here). This project was eventually shelved, as the GFS was not able to represent surface evaporation well. The project that took over from this was moved to a new repository that better reflects the focus of that project (decomposition of FD components), and is located here: FD_Component_Decomposition. The code I've attached here is from a newer project (not yet published) and designed to streamline the FD identification process more and to make it a little easier to use.
Lastly on using drought patches, drought patches is another approach for FD identification, especially useful for looking at specific events or ensuring the FD spans a large enough area. However I have not actually used them myself, and do not know how to implement drought patches in the code (it is a problem I've run into before).
My hopes this is all of use.
Sincerely, Stuart Edris
`import numpy as np from scipy import stats from scipy import interpolate from scipy import signal from datetime import datetime, timedelta import matplotlib.pyplot as plt import matplotlib.colors as mcolors from matplotlib import colorbar as mcolorbar import cartopy.crs as ccrs import caropy.feature as cfeature import cartopy.mpl.ticker as cticker import cartopy.io.shapereader as shpreader
##############################################
def collect_climatology(X, dates, start_year, end_year): ''' Extract data between the beginning and ending years.
Inputs:
:param X: Input 3D data, in time x lat x lon format
:param dates: Array of datetimes corresponding to the time axis of X
:param start_year: The beginning year in the interval being searched
:param end_year: The ending year in the interval being searched
Outputs:
:param X_climo: X between the specified start and end dates
'''
# Turn the start and end years into datetimes
begin_date = datetime(start_year, 1, 1)
end_date = datetime(end_year, 12, 31)
# Determine all the points between the start and end points
ind = np.where( (dates >= begin_date) & (dates <= end_date) )[0]
if len(X.shape) < 3:
X_climo = X[ind]
else:
X_climo = X[ind,:,:]
return X_climo
##############################################
def calculate_climatology(X, pentad = True): ''' Calculates the climatological mean and standard deviation of gridded data. Climatological data is calculated for all grid points and for all timestamps in the year.
Inputs:
:param X: 3D variable whose mean and standard deviation will be calculated.
:param pentad: Boolean value giving if the time scale of X is
pentad (5 day average) or daily.
Outputs:
:param clim_mean: Calculated mean of X for each day/pentad and grid point.
clim_mean as the same spatial dimensions as X and 365 (73)
temporal dimension for daily (pentad) data.
:param clim_std: Calculated standard deviation for each day/pentad and grid point.
clim_std as the same spatial dimensions as X and 365 (73)
temporal dimension for daily (pentad) data.
'''
# Obtain the dimensions of the variable
if len(X.shape) < 3:
T = X.size
else:
T, I, J = X.shape
# Count the number of years
if pentad is True:
year_len = int(365/5)
else:
year_len = int(365)
N_years = int(np.ceil(T/year_len))
# Create a variable for each day, assumed starting at Jan 1 and no
# leap years (i.e., each year is only 365 days each)
day = np.ones((T)) * np.nan
n = 0
for i in range(1, N_years+1):
if i >= N_years:
day[n:T+1] = np.arange(1, len(day[n:T+1])+1)
else:
day[n:n+year_len] = np.arange(1, year_len+1)
n = n + year_len
# Initialize the climatological mean and standard deviation variables
if len(X.shape) < 3:
clim_mean = np.ones((year_len)) * np.nan
clim_std = np.ones((year_len)) * np.nan
else:
clim_mean = np.ones((year_len, I, J)) * np.nan
clim_std = np.ones((year_len, I, J)) * np.nan
# Calculate the mean and standard deviation for each day and at each grid
# point
for i in range(1, year_len+1):
ind = np.where(i == day)[0]
if len(X.shape) < 3:
clim_mean[i-1] = np.nanmean(X[ind], axis = 0)
clim_std[i-1] = np.nanstd(X[ind], axis = 0)
else:
clim_mean[i-1,:,:] = np.nanmean(X[ind,:,:], axis = 0)
clim_std[i-1,:,:] = np.nanstd(X[ind,:,:], axis = 0)
return clim_mean, clim_std
##############################################
def apply_mask(data, mask): ''' Turn sea points into NaNs based on a land-sea mask where 0 is sea and 1 is land
Inputs:
:param data: Data to be masked
:param mask: Land-sea mask. Must have the same spatial dimensions as data
Outputs:
:param data_mask: Data with all labeled sea grids at NaN
'''
T, I, J = data.shape
data_mask = np.ones((T, I, J)) * np.nan
for t in range(T):
data_mask[t,:,:] = np.where(mask == 1, data[t,:,:], np.nan)
return data_mask
##############################################
def calculate_sesr(et, pet, dates, mask, start_year = 1990, end_year = 2020, years = None, months = None, days = None): ''' Calculate the standardized evaporative stress ratio (SESR) from ET and PET.
Full details on SESR can be found in Christian et al. 2019 (for SESR): https://doi.org/10.1175/JHM-D-18-0198.1.
Inputs:
:param et: Input evapotranspiration (ET) data
:param pet: Input potential evapotranspiration (PET) data. Should be in the same units as et
:param dates: Array of datetimes corresponding to the timestamps in et and pet
:param mask: Land-sea mask for the et and pet variables
:param start_year: The start year in the climatological period used
:param end_year: The last year in the climatological period used
:param years: Array of intergers corresponding to the dates.year. If None, it is made from dates
:param months: Array of intergers corresponding to the dates.month. If None, it is made from dates
:param days: Array of intergers corresponding to the dates.day. If None, it is made from dates
Outputs:
:param sesr: Calculate SESR, has the same shape and size as et/pet
'''
# Make the years, months, and/or days variables?
if years == None:
years = np.array([date.year for date in dates])
if months == None:
months = np.array([date.month for date in dates])
if days == None:
days = np.array([date.day for date in dates])
# Obtain the evaporative stress ratio (ESR); the ratio of ET to PET
esr = et/pet
# Remove values that exceed a certain limit as they are likely an error
esr[esr < 0] = np.nan
esr[esr > 3] = np.nan
# Collect the climatological data for the ESR
esr_climo = collect_climatology(esr, dates, start_year = start_year, end_year = end_year)
# Determine the climatological mean and standard deviations of ESR
esr_mean, esr_std = calculate_climatology(esr_climo, pentad = True)
# Find the time stamps for a singular year
ind = np.where(years == 1999)[0] # Note, any non-leap year will do
one_year = dates[ind]
# Calculate SESR; it is the standardized ESR
T, I, J = esr.shape
sesr = np.ones((T, I, J)) * np.nan
for n, date in enumerate(one_year):
ind = np.where( (date.month == months) & (date.day == days) )[0]
for t in ind:
sesr[t,:,:] = (esr[t,:,:] - esr_mean[n,:,:])/esr_std[n,:,:]
# Remove any unrealistic points
sesr = np.where(sesr < -5, -5, sesr)
sesr = np.where(sesr > 5, 5, sesr)
# Remove any sea data points
# This can be commented out if there is no land-sea mask
sesr = apply_mask(sesr, mask)
return sesr
def christian_fd(sesr, mask, dates, years = None, months = None, days = None): ''' Calculate the flash drought using an updated version of the method described in Christian et al. 2019 (https://doi.org/10.1175/JHM-D-18-0198.1). This method uses the evaporative stress ratio (SESR) to identify flash drought. Updates to the method are details (for LSWI) in Christian et al. 2022 (https://doi.org/10.1016%2Fj.rsase.2022.100770).
Inputs:
:param sesr: Input SESR values, time x lat x lon format
:param mask: Land-sea mask for the et and pet variables
:param dates: Array of datetimes corresponding to the timestamps in et and pet
:param mask: Land-sea mask for the et and pet variables. A a value of none can be provided to not use a mask
:param years: Array of intergers corresponding to the dates.year. If None, it is made from dates
:param months: Array of intergers corresponding to the dates.month. If None, it is made from dates
:param days: Array of intergers corresponding to the dates.day. If None, it is made from dates
Outputs:
:param fd: The identified flash drought for all grid points and time steps in sesr
'''
# Make the years, months, and/or days variables?
if years == None:
years = np.array([date.year for date in dates])
if months == None:
months = np.array([date.month for date in dates])
if days == None:
days = np.array([date.day for date in dates])
# Is a mask provided?
mask_provided = mask is not None
# Initialize some variables
T, I, J = sesr.shape
sesr_inter = np.ones((T, I, J)) * np.nan
sesr_filt = np.ones((T, I, J)) * np.nan
sesr2d = sesr.reshape(T, I*J, order = 'F')
sesr_inter2d = sesr_inter.reshape(T, I*J, order = 'F')
sesr_filt2d = sesr_filt.reshape(T, I*J, order = 'F')
if mask_provided:
mask2d = mask.reshape(I*J, order = 'F')
x = np.arange(-6.5, 6.5, (13/T))[:-1] # a variable covering the range of all SESR values with 1 entry for each time step
print(x.size, T)
# Parameters for the filter
WinLength = 21 # Window length of 21 pentads
PolyOrder = 4
# Perform a basic linear interpolation for NaN values and apply a SG filter
print('Applying interpolation and Savitzky-Golay filter to SESR')
for ij in range(I*J):
if mask_provided:
if mask2d[ij] == 0:
continue
else:
pass
# Perform a linear interpolation to remove NaNs
ind = np.isfinite(sesr2d[:,ij])
if np.nansum(ind) == 0:
continue
else:
pass
ind = np.where(ind == True)[0]
interp_func = interpolate.interp1d(x[ind], sesr2d[ind,ij], kind = 'linear', fill_value = 'extrapolate')
sesr_inter2d[:,ij] = interp_func(x)
# Apply the Savitzky-Golay filter to the interpolated SESR data
sesr_filt2d[:,ij] = signal.savgol_filter(sesr_inter2d[:,ij], WinLength, PolyOrder)
# Reorder SESR back to 3D data
sesr_filt = sesr_filt2d.reshape(T, I, J, order = 'F')
# Determine the change in SESR
print('Calculating the change in SESR')
delta_sesr = np.ones((T, I, J)) * np.nan
delta_sesr[1:,:,:] = sesr_filt[1:,:,:] - sesr_filt[:-1,:,:]
# Begin the flash drought calculations
print('Identifying flash drought')
fd = np.ones((T, I, J)) * np.nan
fd2d = fd.reshape(T, I*J, order = 'F')
dsesr2d = delta_sesr.reshape(T, I*J, order = 'F')
dsesr_percentile = 25
sesr_percentile = 20
min_change = timedelta(days = 30)
start_date = dates[-1]
for ij in range(I*J):
if mask_provided:
if mask2d[ij] == 0:
continue
else:
pass
start_date = dates[-1]
for t in range(T):
ind = np.where( (dates[t].month == months) & (dates[t].day == days) )[0]
# Determine the percentiles of dSESR and SESR
ri_crit = np.nanpercentile(dsesr2d[ind,ij], dsesr_percentile)
dc_crit = np.nanpercentile(sesr_filt2d[ind,ij], sesr_percentile)
# If start_date != dates[-1], the rapid intensification criteria is satisified
# If the rapid intensification and drought component criteria are satisified (and FD period is 30+ days)
# then FD occurs
if ( (dates[t] - start_date) >= min_change) & (sesr_filt2d[t,ij] <= dc_crit):
fd2d[t,ij] = 1
else:
fd2d[t,ij] = 0
# # If the change in SESR is below the criteria, change the start date of the flash drought
if (dsesr2d[t,ij] <= ri_crit) & (start_date == dates[-1]):
start_date = dates[t]
elif (dsesr2d[t,ij] <= ri_crit) & (start_date != dates[-1]):
pass
else:
start_date = dates[-1]
# Re-order the flash drought back into a 3D array
fd = fd2d.reshape(T, I, J, order = 'F')
print('Done')
return fd
def display_fd_climatology(fd, lat, lon, dates, method, model = 'narr', path = './Figures', grow_season = False, years = None, months = None): ''' Display the climatology of flash drought (percentage of years with flash drought)
Inputs:
:param fd: Input flash drought (FD) data to be plotted, time x lat x lon format
:param lat: Gridded latitude values corresponding to data
:param lon: Gridded longitude values corresponding to data
:param dates: Array of datetimes corresponding to the timestamps in fd
:param method: String describing the method used to calculate the flash drought
:param model: String describing what reanalysis model the data comes from. Used to name the figure
:param path: Path the figure will be saved to
:param grow_season: Boolean indicating whether fd has already been set into growing seasons.
:param years: Array of intergers corresponding to the dates.year. If None, it is made from dates
:param months: Array of intergers corresponding to the dates.month. If None, it is made from dates
'''
# Make the years, months, and/or days variables?
if years == None:
years = np.array([date.year for date in dates])
if months == None:
months = np.array([date.month for date in dates])
### Calcualte the climatology ###
# Initialize variables
T, I, J = fd.shape
all_years = np.unique(years)
ann_fd = np.ones((all_years.size, I, J)) * np.nan
# Calculate the average number of rapid intensifications and flash droughts in a year
for y in range(all_years.size):
if grow_season:
y_ind = np.where( (all_years[y] == years) )[0]
else:
y_ind = np.where( (all_years[y] == years) & ((months >= 4) & (months <= 10)) )[0] # Second set of conditions ensures only growing season values
# Calculate the mean number of flash drought for each year
ann_fd[y,:,:] = np.nanmean(fd[y_ind,:,:], axis = 0)
# Turn nonzero values to 1 (each year gets 1 count to the total)
ann_fd[y,:,:] = np.where(( (ann_fd[y,:,:] == 0) | (np.isnan(ann_fd[y,:,:])) ),
ann_fd[y,:,:], 1) # This changes nonzero and nan (sea) values to 1.
# Calculate the percentage number of years with rapid intensifications and flash droughts
per_ann_fd = np.nansum(ann_fd[:,:,:], axis = 0)/all_years.size
# Turn 0 values into nan
per_ann_fd = np.where(per_ann_fd != 0, per_ann_fd, np.nan)
#### Create the Plot ####
# Set colorbar information
cmin = -20; cmax = 80; cint = 1
clevs = np.arange(-20, cmax + cint, cint)
nlevs = len(clevs)
cmap = plt.get_cmap(name = 'hot_r', lut = nlevs)
# Get the normalized color values
norm = mcolors.Normalize(vmin = 0, vmax = cmax)
# Generate the colors from the orginal color map in range from [0, cmax]
colors = cmap(np.linspace(1 - (cmax - 0)/(cmax - cmin), 1, cmap.N)) ### Note, in the event cmin and cmax share the same sign, 1 - (cmax - cmin)/cmax should be used
colors[:4,:] = np.array([1., 1., 1., 1.]) # Change the value of 0 to white
# Create a new colorbar cut from the colors in range [0, cmax.]
ColorMap = mcolors.LinearSegmentedColormap.from_list('cut_hot_r', colors)
colorsNew = cmap(np.linspace(0, 1, cmap.N))
colorsNew[abs(cmin)-1:abs(cmin)+1, :] = np.array([1., 1., 1., 1.]) # Change the value of 0 in the plotted colormap to white
cmap = mcolors.LinearSegmentedColormap.from_list('hot_r', colorsNew)
# Shapefile information
# ShapeName = 'Admin_1_states_provinces_lakes_shp'
ShapeName = 'admin_0_countries'
CountriesSHP = shpreader.natural_earth(resolution = '110m', category = 'cultural', name = ShapeName)
CountriesReader = shpreader.Reader(CountriesSHP)
USGeom = [country.geometry for country in CountriesReader.records() if country.attributes['NAME'] == 'United States of America']
NonUSGeom = [country.geometry for country in CountriesReader.records() if country.attributes['NAME'] != 'United States of America']
# Lonitude and latitude tick information
lat_int = 10
lon_int = 20
LatLabel = np.arange(-90, 90, lat_int)
LonLabel = np.arange(-180, 180, lon_int)
LonFormatter = cticker.LongitudeFormatter()
LatFormatter = cticker.LatitudeFormatter()
# Projection information
data_proj = ccrs.PlateCarree()
fig_proj = ccrs.PlateCarree()
# Create the plots
fig = plt.figure(figsize = [12, 10])
# Flash Drought plot
ax = fig.add_subplot(1, 1, 1, projection = fig_proj)
# Set the flash drought title
ax.set_title('Percent of Years from %s - %s with %s Flash Drought'%(all_years[0], all_years[-1], method), size = 18)
# Ocean and non-U.S. countries covers and "masks" data outside the U.S.
ax.add_feature(cfeature.OCEAN, facecolor = 'white', edgecolor = 'white', zorder = 2)
ax.add_feature(cfeature.STATES)
ax.add_geometries(USGeom, crs = fig_proj, facecolor = 'none', edgecolor = 'black', zorder = 3)
ax.add_geometries(NonUSGeom, crs = fig_proj, facecolor = 'white', edgecolor = 'white', zorder = 2)
# Adjust the ticks
ax.set_xticks(LonLabel, crs = ccrs.PlateCarree())
ax.set_yticks(LatLabel, crs = ccrs.PlateCarree())
ax.set_yticklabels(LatLabel, fontsize = 18)
ax.set_xticklabels(LonLabel, fontsize = 18)
ax.xaxis.set_major_formatter(LonFormatter)
ax.yaxis.set_major_formatter(LatFormatter)
# Plot the flash drought data
cs = ax.pcolormesh(lon, lat, per_ann_fd*100, vmin = cmin, vmax = cmax,
cmap = cmap, transform = data_proj, zorder = 1)
# Set the map extent to the U.S.
ax.set_extent([-130, -65, 23.5, 48.5])
# Set the colorbar size and location
cbax = fig.add_axes([0.915, 0.29, 0.025, 0.425])
# Create the colorbar
cbar = mcolorbar.ColorbarBase(cbax, cmap = ColorMap, norm = norm, orientation = 'vertical')
# Set the colorbar label
cbar.ax.set_ylabel('% of years with Flash Drought', fontsize = 18)
# Set the colorbar ticks
cbar.set_ticks(np.arange(0, 90, 10))
cbar.ax.set_yticklabels(np.arange(0, 90, 10), fontsize = 16)
# Show the figure
plt.show(block = False)
# Save the figure
filename = '%s_%s_flash_drought_climatology.png'%(model, method)
plt.savefig('%s/%s'%(path, filename), bbox_inches = 'tight')
plt.show(block = False)`
Dear Prof. Edris, Thank you very much for your reply. If I have any more questions after that, I will continue to ask you for advice, thank you for your full help. Finally, I wish you good health and academic success. Sincerely, yours YANG
Good afternoon Professor, I have now identified flash droughts, storing them as a piece of .npz data that identifies the flash drought region as 1. But I am having trouble with the identification of flash drought frequencies in multiple dimensions. How do I come up with a graph on flash drought frequencies. I have read the article "Application of an improved spatio-temporal identification method of flash droughts", which reads as follows: “The marked binary grids are then classified into drought patches by means of the Contiguous Drought Area (CDA) analysis (Corzo Perez et al., 2011; Diaz et al., 2020). In this approach, a two-scan algorithm is applied. The first run is performed to connect the given grid with its 8 neighbors and assign a provisional label to the contiguous grids.” Can you give me some advice? Extremely grateful!