Parameter selection

tabitaCatalan commented 2 years ago

Problem

I need a better way to decide which parameters to use. So far I have simply been using values of γₑ and γᵢ pulled out of thin air.

Solution 1: Multiple Model Kalman Filter

[x] Search for potentially useful articles
[x] Find a methodology for estimating fixed parameters: multiple-model estimation from Simon.
[x] tabitaCatalan/kalman#9
[ ] ~Test the methodology on the synthetic case.~ <--- this failed 😞
[ ] Use it to obtain better parameters
[ ] Do the new parameters make sense?

This failed because of the extremely competitive nature of the MMKF.
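To make "competitive" concrete, here is a minimal sketch of the multiple-model weight update on a toy problem. It is not the CovidMTK model: the scalar system `x[k+1] = a*x[k] + w`, the candidate values, and the noise levels are all assumptions chosen for illustration. Each candidate gets its own Kalman filter, and its weight is multiplied by the Gaussian likelihood of the innovation at every step, so small per-step differences compound quickly.

```python
import math
import random

def mmkf_weights(ys, candidates, q=1.0, r=0.1):
    """Run one scalar Kalman filter per candidate `a`; return the final model weights."""
    n = len(candidates)
    x = [0.0] * n          # state estimate of each filter
    p = [1.0] * n          # estimate variance of each filter
    w = [1.0 / n] * n      # uniform prior over the candidates
    for y in ys:
        for j, a in enumerate(candidates):
            # Predict with model j.
            x_pred = a * x[j]
            p_pred = a * a * p[j] + q
            # Innovation and its variance.
            nu = y - x_pred
            s = p_pred + r
            # Gaussian likelihood of the measurement under model j.
            w[j] *= math.exp(-0.5 * nu * nu / s) / math.sqrt(2.0 * math.pi * s)
            # Measurement update.
            k = p_pred / s
            x[j] = x_pred + k * nu
            p[j] = (1.0 - k) * p_pred
        # Renormalise so the weights remain a probability distribution.
        total = sum(w)
        w = [wj / total for wj in w]
    return w

# Synthetic data from a hypothetical "true" model.
random.seed(0)
a_true, q, r = 0.9, 1.0, 0.1
x, ys = 0.0, []
for _ in range(2000):
    x = a_true * x + random.gauss(0.0, math.sqrt(q))
    ys.append(x + random.gauss(0.0, math.sqrt(r)))

candidates = [0.5, 0.7, 0.9, 0.99]
weights = mmkf_weights(ys, candidates)
for a, wj in zip(candidates, weights):
    print(f"a = {a:4.2f}  weight = {wj:.3e}")
```

Because the weights are a running product of likelihoods, one candidate typically ends up with almost all of the probability mass, even when the true value lies between grid points.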

Solution 2: Augment the state in the Kalman filter and see what it converges to.

[x] Build an augmented model in which the rates are part of the state
[x] Run it on the data
[ ] Integrity constraints for the rates, in 1/days.
[ ] Tune the initial and model variances
[x] Check whether the rates converge to a fixed value in the synthetic case
[ ] Study sensitivity

If this works, it can be done for the real case, and I also need to look for a justification in the literature.
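The idea of state augmentation can be sketched on a toy problem. The example below is an assumption-laden illustration, not the epidemiological model: it uses the scalar ODE `x' = u - gamma*x` with a known input `u`, an Euler discretisation, and a hand-written 2x2 extended Kalman filter on the augmented state `(x, gamma)`. The unknown rate is modelled as a random walk with a very small variance, and the filter is left to run to see where `gamma` settles.

```python
import math
import random

def ekf_augmented(ys, u, dt, x0, gamma0, p0, q, r):
    """EKF on the augmented state z = (x, gamma) for x' = u - gamma * x."""
    x, g = x0, gamma0
    p11, p12, p22 = p0[0], 0.0, p0[1]   # symmetric 2x2 covariance
    qx, qg = q
    history = []
    for y in ys:
        # Jacobian of the Euler step with respect to (x, gamma), at the current estimate.
        f11 = 1.0 - dt * g
        f12 = -dt * x
        # Predict: x follows the Euler step, gamma is a random walk.
        x = x + dt * (u - g * x)
        a11 = f11 * p11 + f12 * p12
        a12 = f11 * p12 + f12 * p22
        p11, p12 = a11 * f11 + a12 * f12 + qx, a12
        p22 = p22 + qg
        # Update with a direct noisy observation of x (H = [1, 0]).
        s = p11 + r
        k1, k2 = p11 / s, p12 / s
        nu = y - x
        x, g = x + k1 * nu, g + k2 * nu
        p11, p12, p22 = (1.0 - k1) * p11, (1.0 - k1) * p12, p22 - k2 * p12
        history.append(g)
    return x, g, history

# Synthetic observations generated with a hypothetical true rate.
random.seed(1)
gamma_true, u, dt, r = 0.2, 1.0, 0.1, 0.01
x_true, ys = 0.0, []
for _ in range(3000):
    x_true += dt * (u - gamma_true * x_true)
    ys.append(x_true + random.gauss(0.0, math.sqrt(r)))

x_est, gamma_est, history = ekf_augmented(
    ys, u, dt, x0=0.0, gamma0=0.5, p0=(1.0, 0.25), q=(1e-4, 1e-8), r=r)
print(f"estimated gamma = {gamma_est:.3f} (true value {gamma_true})")
```

The initial variance of `gamma` and its random-walk variance `qg` control how quickly the estimate moves and how tightly it locks in, which is what the "tune the initial and model variances" item refers to.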

tabitaCatalan commented 2 years ago

Literature review

Some resources reviewed

📚 Murphy - Machine Learning: A Probabilistic Perspective
Chapter 18: State Space Models
http://noiselab.ucsd.edu/ECE228/Murphy_Machine_Learning.pdf

📑 Ghahramani, Hinton - Parameter estimation for linear dynamical systems
http://mlg.eng.cam.ac.uk/zoubin/course04/tr-96-2.pdf
Linear systems have been used extensively in engineering to model and control the behaviour of dynamical systems. In this note, we present the Expectation Maximization (EM) algorithm for estimating the parameters of linear systems (Shumway and Stoffer, 1982). We also point out the relationship between linear dynamical systems, factor analysis, and hidden Markov models.

📑 Song, Xie, Gao, Zhong, Gu, Choi - Maximum likelihood-based extended Kalman filter for COVID-19 prediction
Prediction of COVID-19 spread plays a significant role in the epidemiology study and government battles against the epidemic. However, the existing studies on COVID-19 prediction are dominated by constant model parameters, unable to reflect the actual situation of COVID-19 spread. This paper presents a new method for dynamic prediction of COVID-19 spread by considering time-dependent model parameters. This method discretises the susceptible-exposed-infected-recovered-dead (SEIRD) epidemiological model in time domain to construct the nonlinear state-space equation for dynamic estimation of COVID-19 spread. A maximum likelihood estimation theory is established to online estimate time-dependent model parameters. Subsequently, an extended Kalman filter is developed to estimate dynamic COVID-19 spread based on the online estimated model parameters. The proposed method is applied to simulate and analyse the COVID-19 pandemics in China and the United States based on daily reported cases, demonstrating its efficacy in modelling and prediction of COVID-19 spread.

This article presents an interesting method, but it is not exactly what I am looking for. It uses a maximum likelihood technique for online estimation of the parameters (which vary in time). My goal, however, is to fit the fixed parameters of the model. I think the idea is useful, and the calculations might work for me with some modifications. But doing that will definitely take more than a day; I don't think I can finish it today.

📑 Robust and efficient parameter estimation in dynamic models of biological systems
https://bmcsystbiol.biomedcentral.com/articles/10.1186/s12918-015-0219-2
Dynamic modelling provides a systematic framework to understand function in biological systems. Parameter estimation in nonlinear dynamic models remains a very challenging inverse problem due to its nonconvexity and ill-conditioning. Associated issues like overfitting and local solutions are usually not properly addressed in the systems biology literature despite their importance.

Here we present a method for robust and efficient parameter estimation which uses two main strategies to surmount the aforementioned difficulties: (i) efficient global optimization to deal with nonconvexity, and (ii) proper regularization methods to handle ill-conditioning. In the case of regularization, we present a detailed critical comparison of methods and guidelines for properly tuning them. Further, we show how regularized estimations ensure the best trade-offs between bias and variance, reducing overfitting, and allowing the incorporation of prior knowledge in a systematic way.

📓 18.337J/6.338J: Parallel Computing and Scientific Machine Learning
Lecture 16: Probabilistic Programming
From Optimization to Probabilistic Programming
https://mitmath.github.io/18337/lecture16/probabilistic_programming
All of our previous discussions lived in a deterministic world. Not this one. Here we turn to a probabilistic view and allow programs to have random variables. Forward simulation of a random program is seen to be simple through Monte Carlo sampling. However, parameter estimation is now much more involved, since in this case we need to estimate not just values but probability distributions. It turns out that Bayes' rule gives a framework for performing such estimations. We see that classical parameter estimation falls out as a maximization of probability with the "simplest" form of distributions, and thus this gives a nice generalization even to standard parameter estimation and justifies the use of L2 loss functions and regularization (as a perturbation by a prior). Next, we turn to estimating the distributions, which we see is possible for small problems using Metropolis Hastings, but for larger problems we develop Hamiltonian Monte Carlo. It turns out that Hamiltonian Monte Carlo has strong ties to both ODEs and differentiable programming: it is defined as solving ODEs which arise from a Hamiltonian, and derivatives of the likelihood are required, which is essentially the same idea as derivatives of cost functions! We then describe an alternative approach: Automatic Differentiation Variational Inference (ADVI), which once again is using the tools of differentiable programming to estimate distributions of probabilistic programs.

Selected resources

⭐📚 Simon - Optimal State Estimation: Kalman, H∞ and Nonlinear Approaches
Chapter 10: Additional topics in Kalman filtering
10.2 Multiple-model estimation
This is the book I have used the most when working with the Kalman filter, and it also has a section on parameter selection.

📚 Peng - A Very Short Course on Time Series Analysis
Chapter 6.2 Maximum likelihood with the Kalman Filter
https://bookdown.org/rdpeng/timeseriesbook/maximum-likelihood-with-the-kalman-filter.html
I think this is the most useful thing I have found so far. It would be wonderful to have more references on this.
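As a rough illustration of maximum likelihood with the Kalman filter, the sketch below evaluates the innovation-based log-likelihood, `-0.5 * sum(log(2*pi*S_k) + nu_k**2 / S_k)`, for a fixed parameter and picks the best value on a grid. The scalar model `x[k+1] = a*x[k] + w`, the noise variances, and the grid are assumptions made for the example; for a real model the grid search would normally be replaced by a numerical optimiser.

```python
import math
import random

def kalman_loglik(ys, a, q, r, x0=0.0, p0=1.0):
    """Log-likelihood of the observations under a scalar linear model with fixed `a`."""
    x, p, ll = x0, p0, 0.0
    for y in ys:
        # Predict.
        x_pred = a * x
        p_pred = a * a * p + q
        # Innovation nu and its variance s feed the likelihood.
        nu = y - x_pred
        s = p_pred + r
        ll += -0.5 * (math.log(2.0 * math.pi * s) + nu * nu / s)
        # Update.
        k = p_pred / s
        x = x_pred + k * nu
        p = (1.0 - k) * p_pred
    return ll

# Synthetic data generated with a hypothetical true parameter.
random.seed(3)
a_true, q, r = 0.8, 1.0, 0.1
x, ys = 0.0, []
for _ in range(2000):
    x = a_true * x + random.gauss(0.0, math.sqrt(q))
    ys.append(x + random.gauss(0.0, math.sqrt(r)))

# Grid search for the maximum likelihood estimate of `a`.
grid = [0.50 + 0.01 * i for i in range(50)]
a_hat = max(grid, key=lambda a: kalman_loglik(ys, a, q, r))
print(f"maximum likelihood estimate of a: {a_hat:.2f}")
```

Unlike the multiple-model weights, the log-likelihood curve over the grid also shows how flat or sharp the optimum is, which gives a first impression of how well the parameter is identified.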

tabitaCatalan commented 2 years ago

Now that tabitaCatalan/kalman#9 is complete, it is time to keep working here. The following is still pending:

[ ] Review the correct way to handle the noise when going from a continuous model to a discrete one.

Steps for testing the methodology on a synthetic case (this should be the definitive one):
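For reference on the noise question: for a linear SDE `dx = A x dt + G dW` with continuous noise intensity `Qc`, the discrete process noise over a step `dt` is `Qd = integral from 0 to dt of exp(A*t) G Qc G' exp(A'*t) dt`, which to first order is `G Qc G' * dt`. The sketch below checks this on a scalar case, `dx = -a x dt + sigma dW`, where the integral has a closed form. The values of `a` and `sigma` are made up for illustration.

```python
import math

def exact_discrete_noise_var(a, sigma, dt):
    """Exact process-noise variance over a step dt for dx = -a x dt + sigma dW."""
    return sigma ** 2 * (1.0 - math.exp(-2.0 * a * dt)) / (2.0 * a)

def first_order_noise_var(sigma, dt):
    """First-order (Euler-Maruyama) approximation: variance scales linearly with dt."""
    return sigma ** 2 * dt

a, sigma = 0.2, 0.5
for dt in (1.0, 0.1, 0.01):
    exact = exact_discrete_noise_var(a, sigma, dt)
    approx = first_order_noise_var(sigma, dt)
    rel_err = (approx - exact) / exact
    print(f"dt = {dt:5.2f}  exact = {exact:.6f}  first order = {approx:.6f}  rel. error = {rel_err:.2%}")
```

The point to keep in mind is that the discrete covariance scales with `dt` (so the standard deviation scales with `sqrt(dt)`); reusing the continuous intensity unchanged, or scaling it by `dt**2`, makes the filter's noise level depend on the step size.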

[x] Choose data from two communes (mobility data).
[x] Design a control for each of them.
[x] Run the forward model with a chosen set of parameters.
[x] Obtain observations of the cumulative cases.
[x] Choose the dynamic and filter parameters to estimate
[ ] Assign distributions to the parameters
[x] Build a suitable MMKF
[ ] Obtain a set of candidate parameters with their respective priors
[x] Run the MMKF with those parameters
[x] Obtain a parameter estimate
[x] Check whether it resembles the one originally used
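The "run the forward model" and "obtain observations of the cumulative cases" steps can be sketched as follows. This is a deliberately simplified stand-in: a single-population SEIR model integrated with Euler steps, without the mobility data or the control, and with hypothetical values for `beta`, the population, the initial condition, and the observation noise.

```python
import random

def simulate_seir(beta, gamma_e, gamma_i, n, e0, days, dt=0.1):
    """Euler integration of a single-population SEIR model, tracking cumulative cases."""
    s, e, i, r, c = n - e0, e0, 0.0, 0.0, 0.0
    steps_per_day = round(1.0 / dt)
    daily_cumulative = []
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = dt * beta * s * i / n
            new_infected = dt * gamma_e * e
            new_removed = dt * gamma_i * i
            s -= new_exposed
            e += new_exposed - new_infected
            i += new_infected - new_removed
            r += new_removed
            c += new_infected      # cumulative cases count every E -> I transition
        daily_cumulative.append(c)
    return daily_cumulative, (s, e, i, r)

random.seed(2)
n = 100_000.0
truth, final_state = simulate_seir(
    beta=0.5, gamma_e=0.2, gamma_i=0.14, n=n, e0=10.0, days=60)

# Noisy synthetic observations of the cumulative cases, one per day.
obs_std = 5.0
observations = [c + random.gauss(0.0, obs_std) for c in truth]
print(f"cumulative cases after 60 days: {truth[-1]:.0f}")
```

Since the generating parameters are known, any estimate produced by the filter from `observations` can be compared directly against them, which is the purpose of the last item in the list.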

tabitaCatalan commented 2 years ago

Parameters to estimate

The dynamic parameters to estimate are:

julia> p_real
   γₑ => 0.19607843137254904
   γᵢ => 0.1388888888888889
 N[1] => 157769.9925628683 # known
 N[2] => 334859.65598120925 # known
 β[1] => 1.0 # known
 β[2] => 60.0

In addition, I need to choose the magnitudes of the noise dispersion matrices (6 values for the state and n for the observations). I could also estimate the initial covariance matrix (6 more parameters). I prefer not to estimate the initial conditions; they can usually be corrected with a smoother (which I would run with the final parameters).
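As a quick sanity check on the two rates in `p_real`, they can be read as inverse mean durations, assuming (as the checklist above suggests) that they are expressed in 1/days. A small sketch of the conversion:

```python
# Values copied from p_real; units assumed to be 1/days.
gamma_e = 0.19607843137254904
gamma_i = 0.1388888888888889

# The mean time spent in a compartment with constant exit rate gamma is 1 / gamma.
mean_exposed_days = 1.0 / gamma_e
mean_infectious_days = 1.0 / gamma_i

print(f"mean exposed period:    {mean_exposed_days:.2f} days")
print(f"mean infectious period: {mean_infectious_days:.2f} days")
```

These come out at about 5.1 and 7.2 days, which is a convenient scale for judging whether an estimated rate is plausible.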

Parameter distributions

The distributions of the dynamic parameters should be chosen according to the literature, which I still need to review. The others look harder to me, but I could give reasonable ranges and use some normal-like distribution with heavy tails.
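One possible reading of "normal-like but heavy-tailed, within a reasonable range" is a Student-t distribution truncated to an interval. That choice, and every number below (centre, scale, degrees of freedom, range), is an assumption for illustration rather than something taken from the literature. The sketch builds the t variate from a normal and a chi-square draw and uses rejection to enforce the range.

```python
import math
import random

def sample_truncated_t(center, scale, dof, low, high, rng):
    """Draw from a location-scale Student-t, rejecting values outside [low, high]."""
    while True:
        z = rng.gauss(0.0, 1.0)
        # A chi-square with `dof` degrees of freedom is Gamma(shape=dof/2, scale=2).
        chi2 = rng.gammavariate(dof / 2.0, 2.0)
        value = center + scale * z / math.sqrt(chi2 / dof)
        if low <= value <= high:
            return value

rng = random.Random(0)
# Hypothetical prior for a rate in 1/days: centred at 1/7, allowed between 1/14 and 1/3.
low, high = 1.0 / 14.0, 1.0 / 3.0
samples = [sample_truncated_t(1.0 / 7.0, 0.03, 3, low, high, rng) for _ in range(1000)]
print(f"min = {min(samples):.3f}  max = {max(samples):.3f}  mean = {sum(samples) / len(samples):.3f}")
```

A small number of degrees of freedom (here 3) gives the heavy tails; increasing it moves the distribution towards a truncated normal.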

tabitaCatalan commented 2 years ago

Problems with the Multiple Model Kalman Filter

I confirmed an issue with this method that was already pointed out in this article:

📑 Akca, Efe - Multiple Model Kalman and Particle Filters and Applications: A Survey https://www.sciencedirect.com/science/article/pii/S2405896319300977

it is too competitive, and tends to give all the probability to a single parameter value instead of choosing a combination.

The results are also far from robust. Both images were obtained from the same parameters and initial conditions, and the noise in the algorithm alone produces completely different results.

tabitaCatalan commented 2 years ago

Solution 2: Augment the state in the Kalman filter and see what it converges to.

At least in the synthetic case I am getting encouraging results: the filter remains observable and manages to fit the estimated values, although the estimate does not look very good. I still need to tune the filter.

tabitaCatalan commented 2 years ago

After letting both gamma_e and gamma_i vary, the filter gives much better results than before, although the parameters still need tuning. The control estimates still leave a lot to be desired. The value of the exterior beta still has to be added.

tabitaCatalan / CovidMTK