X-DataInitiative / tick

Module for statistical learning, with a particular emphasis on time-dependent modelling
https://x-datainitiative.github.io/tick/
BSD 3-Clause "New" or "Revised" License

Usage of HawkesCumulantMatching #431

Closed · trouleau closed 4 years ago

trouleau commented 4 years ago

Hello,

I tried simulating long realizations of a multivariate Hawkes process with SimuHawkesExpKernels and learning the parameters with HawkesCumulantMatching, but the estimation is not consistent (model.adjacency even has small negative values), as can be seen in the small toy example below.

import numpy as np

from tick.hawkes.simulation import SimuHawkesExpKernels
from tick.hawkes import HawkesCumulantMatching

np.random.seed(123)

# Generate model parameters
n_nodes = 5
adjacency = np.random.binomial(n=1, p=np.log(n_nodes)/n_nodes, size=(n_nodes, n_nodes)) / 2
baseline = np.random.uniform(0.0, 0.1, size=n_nodes)
decays = 1.0

# Simulate the process
hawkes_simu = SimuHawkesExpKernels(adjacency=adjacency, decays=decays,
                                   baseline=baseline, max_jumps=int(3e5),
                                   verbose=False, seed=123)
hawkes_simu.simulate()

model = HawkesCumulantMatching(
    integration_support=200.0,
    C=100.0, penalty='elasticnet', elastic_net_ratio=0.95,
    solver='adam', step=0.01, tol=1e-08, max_iter=1000)
model.fit(events=hawkes_simu.timestamps)

print(adjacency.round(2))  # ground truth
print(model.adjacency.round(2))  # estimation
print(np.max(np.abs(adjacency - model.adjacency)))  # max estimation error

with output:

[[0.5 0.  0.  0.  0.5]
 [0.  0.5 0.5 0.  0. ]
 [0.  0.5 0.  0.  0. ]
 [0.5 0.  0.  0.  0. ]
 [0.  0.5 0.5 0.  0.5]]
[[ 0.54  0.23  0.   -0.    0.28]
 [ 0.    0.24  0.65  0.1   0.04]
 [ 0.    0.39  0.02  0.    0.06]
 [ 0.48 -0.    0.    0.    0.  ]
 [ 0.11  0.72  0.    0.    0.41]]
0.4987121740781347

The code was run with tick==0.6.0.0 and tensorflow==1.15.0.

Is there a problem with this toy example?

Thank you!

Mbompr commented 4 years ago

I don't see anything straightforwardly wrong.

A few suggestions:

Hope it helps! Pinging @achab, the author of this algorithm.

achab commented 4 years ago

Thanks for pinging me @Mbompr! Hello @trouleau, we met a few years ago in Zürich!

@Mbompr, your suggestions all make sense; I would go for the last one. The estimation is pretty sensitive to integration_support: this support should include the information you'd like to estimate, but shouldn't be too large.

When integration_support is too high, you include a lot of noise in your estimation, which increases the variance of the estimator (this is encoded in the second part of condition 4) of Theorem 3 in this article). The integration_support is the integration support of the cumulants' densities: here you simulate Hawkes processes with an exponential decay of 1. Assuming that an exponential density is close to zero after 5 times its characteristic time scale, the covariance density and the skewness density should be close to zero after roughly +/- 5 times and +/- 10 times that scale, respectively. So I would rather go for integration_support = 20 (a quick sketch follows below). I acknowledge that this analysis is impossible when you don't know the kernels' supports: in that case you should use an integration_support that roughly corresponds to the characteristic time scale of your system.
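
For the toy example above, a minimal sketch of re-fitting with the tighter support (it reuses adjacency and hawkes_simu from the first snippet; the value 20.0 is just the heuristic above, not a library default):

import numpy as np
from tick.hawkes import HawkesCumulantMatching

# Heuristic above: with exponential kernels of decay 1, the cumulant
# densities are essentially zero past ~20 time units, so a support of 20
# should capture the signal while adding much less noise than 200.
model = HawkesCumulantMatching(
    integration_support=20.0,
    C=100.0, penalty='elasticnet', elastic_net_ratio=0.95,
    solver='adam', step=0.01, tol=1e-08, max_iter=1000)
model.fit(events=hawkes_simu.timestamps)  # events simulated above

print(np.max(np.abs(adjacency - model.adjacency)))  # should shrink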

Plus, the theorem proves the consistency of the estimator, not uniform convergence, so the maximum estimation error is too harsh a metric here.
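
For instance, an aggregate error is more in line with what the theorem controls (a sketch; the choice of the Frobenius norm here is ours, any aggregate norm would do):

import numpy as np

# Relative Frobenius error: averages over all entries instead of being
# driven by the single worst-estimated coefficient.
rel_err = np.linalg.norm(adjacency - model.adjacency) / np.linalg.norm(adjacency)
print(rel_err)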

Hope it helps!

trouleau commented 4 years ago

Hello all, thanks a lot for your suggestions. I'll tune the integration_support. @achab, I do remember that we met a few years ago. Excellent paper and great lib, by the way! :)