tapios / risk-networks

Code for risk networks: a blend of compartmental models, graphs, data assimilation and semi-supervised learning
Other
2 stars 2 forks source link

Is the Ensemble Kalman updating states correctly? #95

Closed odunbar closed 4 years ago

odunbar commented 4 years ago

I input 2 uniform(0,1) states for 10 ensemble members each. No model parameters need updating, just the state.

We have true data of 1.0, variance of 1e-9 -> essentially certain data.

I expect therefore, that the output state to be an ensemble very close to 1 with small variance. Instead i get an ensemble almost identical to the input.

This snippet is the MWE (github doesn't allow me to attach .py file here):

import os, sys; sys.path.append(os.path.join("..")) import numpy as np import copy from epiforecast.ensemble_adjusted_kalman_filter import EnsembleAdjustedKalmanFilter

np.random.seed(999111999) ensemble_size = 10

data

truth = np.array([1.0,1.0]) var = np.array([1e-9,1e-9]) cov=np.diag(var)

generate ensemble_state

ensemble_state=np.random.random(size=(ensemble_size,2))

pass in 'no' model parameters

ensemble_transition_rates = np.empty((ensemble_size, 0), dtype=float) ensemble_transmission_rate = np.empty((ensemble_size, 0), dtype=float) eakf=EnsembleAdjustedKalmanFilter() prev_ensemble_state = copy.deepcopy(ensemble_state) (ensemble_state[:,:], new_ensemble_transition_rates, new_ensemble_transmission_rate ) = eakf.update(ensemble_state[:,:], ensemble_transition_rates, ensemble_transmission_rate, truth, cov) print(np.mean(prev_ensemble_state,axis=0)) print(" ") print(np.mean(ensemble_state,axis=0))

odunbar commented 4 years ago

Myself and Alfredo have found the culprit! The EAKF uses a logit transform of the data (with bounds). The cov is also subject to the logit transform.

Through this transform the logit(truth) = 20.72... but the logit(cov) = [ [1e9,0][0,1e9] ] which means the data is essentially meaningless

A quick fix here is to never set states = 0 or 1, 0.99 or 0.01 is fine