Closed capelastegui closed 5 years ago
The following code implements residual block bootstrapping:
def get_block_size(s_date):
s_date=pd.Series(s_date)
min_date_delta = s_date.diff().min()
# Default: not a date or unknown freq
l_min=4
l_max=1000
if isinstance(min_date_delta,pd.Timedelta):
# Yearly or quarterly data
if min_date_delta >= pd.Timedelta(90,'d'):
l_min=4
l_max=8
# Monthly data
elif min_date_delta >= pd.Timedelta(28,'d'):
l_min=12
l_max=24
# Weekly data
elif min_date_delta >= pd.Timedelta(7,'d'):
l_min=4
l_max=104
# Daily data
elif min_date_delta >= pd.Timedelta(7,'d'):
l_min=7
l_max=712
l_initial = np.ceil(s_date.index.size*0.2)
l = max(min(l_initial, l_max), l_min)
return int(l)
def get_residuals_sample_by_block(s_residuals, series_length, block_size):
# Number of sampling blocks per experiment: add 2 blocks around edges
n_blocks = int(np.ceil(series_length/block_size)+2)
# Index of short version of residuals series
#- excludes block_size samples at the end
index_short = s_residuals.index[0:s_residuals.index.size-block_size]
# Random samples of index_short - use to sample segments of s_residuals
a_index_sample = pd.Series(index_short).sample(
n_blocks * n, replace=True).values
# Map index_short to segments from residuals series
s_values_map = pd.Series([s_residuals[i:i+block_size].values for i in index_short])
# Sample residuals series by block-sized segments
s_sample_unfiltered = s_values_map.loc[a_index_sample]
a_sample_unfiltered = s_sample_unfiltered.values.reshape(n,n_blocks)
# For each experiment, start taking values at rnd(0,block_size),
# To avoid always matching block start
a_starting_index = pd.Series(np.arange(0,block_size)).sample(n, replace=True).values
a_sample = np.array([
np.concatenate(a_sample_unfiltered[i])
[starting_index:starting_index+series_length]
for i in range(0,n)
])
a_sample = np.cumsum(a_sample, axis=1)
return a_sample
After testing prediction intervals with the new method, we found that it doesn't really achieve our main goal - narrower prediction intervals. In fact, it can make prediction intervals wider (though probably more accurate) in some scenarios. We won't be adding this feature for now.
Current implementation of prediction intervals uses simple bootstrapping by taking individual random samples. This can be improved by using block bootstrapping, as described here: https://otexts.com/fpp2/bootstrap.html