Open ikatsov opened 4 years ago
It's not automatic, but you should be able to get the effects by multiplying the design matrix by the sampled regression weights:
```python
price_effect, price_effect_params = model.components[-1], q_samples_demand_[-5:]
# shape [num_samples, num_features]
regression_weights = price_effect.params_to_weights(*price_effect_params)
# shape [num_samples, num_timesteps, num_features]
sampled_effects = tf.einsum('ij,...j->...ij',
                            price_effect.design_matrix.to_dense(),
                            regression_weights)
effects_dist = tfd.Empirical(sampled_effects, event_ndims=2)
```
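The shape bookkeeping in that einsum can be checked with a plain-NumPy stand-in (toy sizes; `design_matrix` and `weights` here are made-up placeholders, not the model's actual tensors):

```python
import numpy as np

num_samples, num_timesteps, num_features = 4, 10, 3

# Toy stand-ins for the design matrix and sampled regression weights.
design_matrix = np.random.randn(num_timesteps, num_features)
weights = np.random.randn(num_samples, num_features)

# 'ij,...j->...ij' broadcasts each feature's weight down its column of the
# design matrix, keeping per-feature effects separate instead of summing over j.
effects = np.einsum('ij,...j->...ij', design_matrix, weights)
print(effects.shape)  # (4, 10, 3): [num_samples, num_timesteps, num_features]

# Summing over the feature axis recovers the usual (total) regression output.
total = effects.sum(axis=-1)
assert np.allclose(total, weights @ design_matrix.T)
```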
Thanks, Dave! I've implemented a generic helper function that expands regression components into feature-level subcomponents, as shown below. However, there is one issue: the design matrix has to contain features for both the training and forecast periods, so it is not clear how to pick the right time range. The workaround is an is_forecast flag that prescribes which range to pick, but it is clearly a hack. Do you have any suggestions for handling this in a cleaner way?
```python
component_dists = sts.decompose_by_component(
    model,
    observed_time_series=observed_time_series,
    parameter_samples=q_samples)

forecast_component_dists = sts.decompose_forecast_by_component(
    model,
    forecast_dist=forecast_dist,
    parameter_samples=q_samples)
```
```python
def regression_dist(regression_component, regression_params):
  # shape [num_samples, num_features]
  regression_weights = regression_component.params_to_weights(*regression_params)
  # shape [num_samples, num_timesteps, num_features]
  sampled_effects = tf.einsum('ij,...j->...ij',
                              regression_component.design_matrix.to_dense(),
                              regression_weights)
  effects_dist = tfd.Empirical(sampled_effects, event_ndims=2)
  return effects_dist


def component_effects(component_dists, parameter_samples, is_forecast):
  means, stddevs = collections.OrderedDict(), collections.OrderedDict()
  for k, dist in component_dists.items():
    means[k.name] = dist.mean()
    stddevs[k.name] = dist.stddev()
    if isinstance(k, (sts.SparseLinearRegression, sts.LinearRegression)):
      params = {p_name: v for (p_name, v) in parameter_samples.items()
                if p_name.startswith(k.name)}
      features_dist = regression_dist(k, params.values())
      # hack: the design matrix spans both periods (training rows first,
      # forecast rows after), so take the first timesteps for decomposition
      # and the last ones for the forecast
      n = dist.event_shape[0]
      def take_time_range(w):
        return w[-n:] if is_forecast else w[:n]
      for i, w in enumerate(tf.transpose(features_dist.mean())):
        means[f'{k.name}{i}'] = take_time_range(w)
      for i, w in enumerate(tf.transpose(features_dist.stddev())):
        stddevs[f'{k.name}{i}'] = take_time_range(w)
  return means, stddevs


component_means_, component_stddevs_ = component_effects(
    component_dists, q_samples, False)
forecast_component_means_, forecast_component_stddevs_ = component_effects(
    forecast_component_dists, q_samples, True)
```
TensorFlow STS (Structural Time Series) allows decomposing a time series or forecast using the
sts.decompose_forecast_by_component
method. For a model where one of the components is a LinearRegression (or SparseLinearRegression) with multiple features, the decomposition returns only a single series for that component. Is there a way to break it down into the effects of the individual features? Example: