ETAS in R: Resources, Overview, and Tasks Breakdown (Updated with resources and links)

Several groups are working on the ETAS model in R. To reduce redundancy, I would like to give an overview and break the work into sub-problems as I see them, so that we can divide up tasks more efficiently. I explained this in a comment on another thread, but I am restating it here for better visibility.

End Goal: A method of comparison between ETAS and MDA. Either a loss function or an error diagram as in Luen's paper, which plots prediction error (proportion of earthquakes missed) against tau (proportion of time the alarm is on); the lower the curve, the better.

To generate error diagrams, we need code that finds when the alarm is on over the time domain (the level sets of the conditional intensity over time). With that we can compute the prediction error (proportion of earthquakes missed) for a particular tau (proportion of time covered by alarms).

Once we can compare models and have established benchmark predictions for ETAS, we can get to the real meat of our problem: modifying the MDA model to beat that benchmark.

Steps we need to take:

Definition:

a. We need to define a domain for the modelling. This consists of a time range, a magnitude range, and a spatial range (drawing a geo-spatial polygon on the map to represent a specific earthquake fault we are modelling).

b. We need to decide which ETAS model to use. There is a succession of them: temporal ETAS, space-time ETAS, hierarchical space-time ETAS (HIST-ETAS), and more recent extensions (Ogata 2011, Ogata and Zhuang 2006). See papers by Ogata et al. The most recent ETAS model would be more difficult to implement but would give us a stronger adversary to conquer. An older version would be easier but runs the risk of fighting a strawman and would dilute the impact of our achievements.

c. We need to decide which package(s) (languages?) to use, and which code we may need to write ourselves.
-R ETAS sucks in my opinion: it only provides parameter estimation and generates some plots. The algorithm it uses to fit the model (Davidon-Fletcher-Powell) is very inefficient compared to the most recent one (Expectation Maximization).
-R SAPP is great in my opinion: it provides functions for simulation and for estimation of the conditional intensity, both of which we will need.
-R PtProcess is more general and has more tools but is less specific to ETAS.
-Code from Luen. He has agreed to share his code but asks which specific code we need, as he has written various code for different things. It may or may not be in R.

d. What data are we going to compare on? True data? Simulated data? This affects our inference.
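
To make item (a) concrete, here is a minimal sketch of restricting a catalog to a modelling domain. It is written in Python purely for illustration (the same logic ports to R), and the `Event` record, its field names, and the helper names are hypothetical, not taken from any of the packages above. The polygon test is a plain ray-casting check on (lon, lat) vertices, which ignores map projection and is only reasonable for small regions.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Event(NamedTuple):
    """Hypothetical catalog record; field names are assumptions."""
    time: float  # e.g. days since the start of the catalog
    mag: float
    lon: float
    lat: float


def point_in_polygon(lon: float, lat: float,
                     polygon: Sequence[Tuple[float, float]]) -> bool:
    """Ray-casting test; polygon is a sequence of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_catalog(events: Sequence[Event],
                   t_range: Tuple[float, float],
                   m_range: Tuple[float, float],
                   polygon: Sequence[Tuple[float, float]]) -> List[Event]:
    """Keep events inside the time range, magnitude range, and polygon."""
    t0, t1 = t_range
    m0, m1 = m_range
    return [e for e in events
            if t0 <= e.time <= t1
            and m0 <= e.mag <= m1
            and point_in_polygon(e.lon, e.lat, polygon)]
```

Whatever we settle on, ETAS and MDA should be fitted and evaluated on exactly the same filtered catalog.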

Fitting the model: Use one of the packages or write our own code to estimate the ETAS parameters for our specified spatio-temporal domain, so that we can simulate data and compute our alarms. Computational time is a concern here, so we may want to write our own implementation of EM or find one online.
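
If we do end up writing our own fitting code, the quantity to maximize is the point-process log-likelihood. The sketch below assumes the simplest variant, the purely temporal ETAS model with conditional intensity lambda(t) = mu + sum over t_i < t of K * exp(alpha * (m_i - m0)) * (t - t_i + c)^(-p). The parameter names follow the usual notation but are assumptions here, as is the convention that event times lie in [0, T]. It is Python for illustration only, and it is neither the EM nor the Davidon-Fletcher-Powell procedure, just the objective either one would work on.

```python
import math
from typing import Sequence


def etas_intensity(t: float, times: Sequence[float], mags: Sequence[float],
                   mu: float, K: float, alpha: float, c: float, p: float,
                   m0: float) -> float:
    """Temporal ETAS conditional intensity at time t, using events before t."""
    lam = mu
    for ti, mi in zip(times, mags):
        if ti < t:
            lam += K * math.exp(alpha * (mi - m0)) * (t - ti + c) ** (-p)
    return lam


def etas_loglik(times: Sequence[float], mags: Sequence[float], T: float,
                mu: float, K: float, alpha: float, c: float, p: float,
                m0: float) -> float:
    """Log-likelihood on [0, T]: sum of log intensities minus the integral."""
    log_term = sum(
        math.log(etas_intensity(ti, times, mags, mu, K, alpha, c, p, m0))
        for ti in times
    )
    integral = mu * T
    for ti, mi in zip(times, mags):
        # Closed-form integral of the Omori kernel from ti to T.
        if p == 1.0:
            omori = math.log((T - ti + c) / c)
        else:
            omori = (c ** (1 - p) - (T - ti + c) ** (1 - p)) / (p - 1)
        integral += K * math.exp(alpha * (mi - m0)) * omori
    return log_term - integral
```

The double loop makes this O(n^2) in the number of events, which is part of why computational time is a concern for large catalogs.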
(Optional) Simulating earthquake catalogs: As part of his methodology, Luen ran comparisons on data generated from simulations. We could instead use just the true dataset, or use permutation or bootstrapping.
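
For reference, one common way to simulate a temporal ETAS catalog is as a branching process: background events arrive as a Poisson process, and every event triggers its own Poisson number of aftershocks with Omori-distributed delays. The sketch below is Python for illustration and is not Luen's code or the SAPP simulator. It assumes p > 1 (so the expected number of offspring is finite), exponentially distributed magnitudes above m0 with rate `beta` (a Gutenberg-Richter assumption that is not in the write-up above), and all function and parameter names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def poisson(rng: random.Random, lam: float) -> int:
    """Poisson draw: count unit-rate exponential arrivals in [0, lam]."""
    n, t = 0, rng.expovariate(1.0)
    while t < lam:
        n += 1
        t += rng.expovariate(1.0)
    return n


def simulate_etas(T: float, mu: float, K: float, alpha: float, c: float,
                  p: float, beta: float, m0: float,
                  seed: int = 0) -> List[Tuple[float, float]]:
    """Simulate (time, magnitude) pairs on [0, T]; requires p > 1."""
    rng = random.Random(seed)

    def draw_mag() -> float:
        return m0 + rng.expovariate(beta)

    # Background events: homogeneous Poisson process with rate mu.
    queue = [(rng.uniform(0.0, T), draw_mag())
             for _ in range(poisson(rng, mu * T))]
    events = []
    while queue:
        t, m = queue.pop()
        events.append((t, m))
        # Expected offspring: integral of the kernel over all delays.
        mean_children = K * math.exp(alpha * (m - m0)) * c ** (1 - p) / (p - 1)
        for _ in range(poisson(rng, mean_children)):
            # Inverse-transform draw from the normalized Omori density.
            u = rng.random()
            wait = c * ((1.0 - u) ** (-1.0 / (p - 1)) - 1.0)
            if t + wait <= T:
                queue.append((t + wait, draw_mag()))
    events.sort()
    return events
```

Under these assumptions the process is only stable when the average number of offspring per event is below one, which needs beta > alpha among other things; parameter values should be checked for that before simulating.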
Generating our predictions (alarms) and finding prediction error per tau: To generate predictions with ETAS, Luen used the following rule: the alarm turns on when the conditional intensity is above a certain threshold. If we use the same rule, R SAPP has a conditional intensity estimation function we can use. We need code to find the level sets of the conditional intensity on the timeline (the times when the alarm is on, for each level of tau, the alarm-on proportion). Once we have the alarms we can find the prediction error, and with that we can compare ETAS and MDA and generate the error diagram.
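
The threshold rule can be sketched as follows, assuming we already have the conditional intensity evaluated on an equally spaced time grid and just before each earthquake (both inputs are assumptions about what our fitting code would produce, and the function names are hypothetical). This is Python for illustration only. The threshold for a given tau is chosen so that roughly a fraction tau of the grid points have the alarm on; with tied intensity values the actual coverage can be slightly larger.

```python
import math
from typing import Sequence


def alarm_threshold(grid_intensity: Sequence[float], tau: float) -> float:
    """Intensity level above which the alarm is on for a fraction tau of time."""
    n_on = int(round(tau * len(grid_intensity)))
    if n_on == 0:
        return math.inf  # alarm is never on
    ranked = sorted(grid_intensity, reverse=True)
    return ranked[n_on - 1]


def miss_rate(event_intensity: Sequence[float], threshold: float) -> float:
    """Proportion of earthquakes that occurred while the alarm was off."""
    missed = sum(1 for lam in event_intensity if lam < threshold)
    return missed / len(event_intensity)
```

For example, with grid intensities 1 through 10 and tau = 0.3, the alarm is on wherever the intensity is at least 8, and an earthquake whose preceding intensity was 2 counts as missed.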
Generate error diagrams: We could iterate over different levels of alarm coverage tau. This gives us a baseline for comparison against the MDA models we generate.
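
Iterating over tau could look like the sketch below, which returns the points of the error diagram as (tau, proportion missed) pairs. As before it is Python for illustration, it assumes intensities on an equally spaced grid and just before each event, and the function names are hypothetical. The area under the curve is included only as one possible single-number summary (smaller is better); it is a suggestion of this sketch, not something taken from Luen's paper.

```python
from typing import List, Sequence, Tuple


def error_diagram(grid_intensity: Sequence[float],
                  event_intensity: Sequence[float],
                  taus: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (tau, miss rate) pairs for a threshold-on-intensity alarm."""
    ranked = sorted(grid_intensity, reverse=True)
    n = len(ranked)
    curve = []
    for tau in taus:
        n_on = int(round(tau * n))
        # Alarm is on where intensity >= threshold; never on when n_on == 0.
        threshold = ranked[n_on - 1] if n_on > 0 else float("inf")
        missed = sum(1 for lam in event_intensity if lam < threshold)
        curve.append((tau, missed / len(event_intensity)))
    return curve


def area_under_curve(curve: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under the error curve; taus must be increasing."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area
```

Running the same function on the MDA alarms (or on whatever score MDA thresholds) would put both curves on the same axes.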
Iterate MDA to beat ETAS: We now have the framework set up to compare the MDA models we generate against ETAS.

Resources and links:

-Our project: Stark slides; more Stark slides; Luen dissertation; Luen slides; Luen and Stark paper; paper on the math behind error diagrams (Molchan diagrams)

-ETAS model: overview of the ETAS model; most recent ETAS model (Ogata 2011); second most recent ETAS model (Ogata and Zhuang 2006); ETAS parameter estimation; ETAS example (Lombardi paper); ETAS example paper (China): geophysical interpretation of the ETAS parameters; implementation of the EM algorithm to estimate ETAS parameters

-ETAS R packages: PtProcess paper; PtProcess package; SAPP package; ETAS package

