bashtage / arch

ARCH models in Python

My scratch implementation does not match the result for EGARCH #667

Closed xuJ14 closed 1 year ago

xuJ14 commented 1 year ago

I wrote my own EGARCH model in Python. When I compare the results with those from arch_model, they are totally different, both the parameter values and the log-likelihood. I wonder:

  1. whether my understanding of the EGARCH model is right.
  2. whether the implementation in arch_model is right.

So,

  1. My understanding of EGARCH(1,1,1) with a normal distribution: $$ \begin{aligned} r_t &= \mu + \epsilon_t \\ \ln\left(\sigma_t^2\right) &= \omega + \alpha_1\left(\left|\frac{\epsilon_{t-1}}{\sigma_{t-1}}\right| - \sqrt{\frac{2}{\pi}}\right) + \gamma_1 \frac{\epsilon_{t-1}}{\sigma_{t-1}} + \beta_1 \ln\left(\sigma_{t-1}^2\right) \\ \epsilon_t &= \sigma_t e_t, \quad e_t \stackrel{\text{i.i.d.}}{\sim} N(0,1) \end{aligned} $$ The log-likelihood: $$ \frac{1}{2}\sum_{t=1}^{n}\left[\ln(2\pi) + \ln\left(\frac{\epsilon_t^2}{\sigma_t^2}\right)\right] $$
  2. Here is my implementation in Python:
import numpy as np
import scipy.optimize


def egarch_likelihood(parameters, data, lnsigma2, out=None):
    mu = parameters[0]
    omega = parameters[1]
    alpha = parameters[2]
    gamma = parameters[3]
    beta = parameters[4]

    T = len(data)
    eps = data - mu
    for t in range(1, T):
        # Parentheses keep the recursion on one logical line; without them,
        # the continuation lines are silently evaluated as separate no-op
        # statements and never added to lnsigma2[t].
        lnsigma2[t] = (
            omega
            + alpha * (np.abs(eps[t-1] / np.sqrt(np.exp(lnsigma2[t-1]))) - np.sqrt(2 / np.pi))
            + gamma * eps[t-1] / np.sqrt(np.exp(lnsigma2[t-1]))
            + beta * lnsigma2[t-1]
        )
    logliks = 0.5 * (np.log(2 * np.pi) + eps**2 / np.exp(lnsigma2))
    loglik = np.sum(logliks)
    return loglik


startingVals = np.array([
    data.mean(),
    np.log(data.var()),
    0.1,
    0.1,
    0.9
])
lnsigma2 = np.ones(len(data)) * np.log(data.var())
args = (data, lnsigma2)
estimates = scipy.optimize.minimize(
    egarch_likelihood, startingVals, args=args, bounds=bounds, method="SLSQP"
)

I found that in the arch_model package the authors seem to add some conditions/boundaries on the parameters, which appears to disagree with the EGARCH specification.

The results are so different that I am really confused. Am I wrong?

Sorry for the massive math script.

bashtage commented 1 year ago

What happens when you simulate a long time series and then estimate the parameters? They should be close. If they aren't, then there is an issue with your log-likelihood.
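A minimal sketch of this simulate-then-estimate check. The parameter values, the backcast choice, and the series length are illustrative assumptions; the objective here is the standard Gaussian negative log-likelihood (which includes a $\ln\sigma_t^2$ term), not the package's exact implementation.

```python
import numpy as np
from scipy import optimize


def simulate_egarch(params, T, rng):
    """Simulate an EGARCH(1,1,1) series from the recursion in the question."""
    mu, omega, alpha, gamma, beta = params
    e = rng.standard_normal(T)
    lnsigma2 = np.empty(T)
    eps = np.empty(T)
    lnsigma2[0] = omega / (1.0 - beta)  # unconditional level of ln(sigma^2)
    eps[0] = np.exp(0.5 * lnsigma2[0]) * e[0]
    for t in range(1, T):
        z = eps[t - 1] / np.exp(0.5 * lnsigma2[t - 1])
        lnsigma2[t] = (
            omega
            + alpha * (np.abs(z) - np.sqrt(2.0 / np.pi))
            + gamma * z
            + beta * lnsigma2[t - 1]
        )
        eps[t] = np.exp(0.5 * lnsigma2[t]) * e[t]
    return mu + eps


def neg_loglik(params, data):
    """Gaussian negative log-likelihood; note the ln(sigma_t^2) term."""
    mu, omega, alpha, gamma, beta = params
    eps = data - mu
    T = len(data)
    lnsigma2 = np.empty(T)
    lnsigma2[0] = np.log(eps.var())  # simple backcast for the first observation
    for t in range(1, T):
        z = eps[t - 1] / np.exp(0.5 * lnsigma2[t - 1])
        lnsigma2[t] = (
            omega
            + alpha * (np.abs(z) - np.sqrt(2.0 / np.pi))
            + gamma * z
            + beta * lnsigma2[t - 1]
        )
    return 0.5 * np.sum(np.log(2.0 * np.pi) + lnsigma2 + eps**2 / np.exp(lnsigma2))


rng = np.random.default_rng(0)
true_params = np.array([0.0, 0.01, 0.1, -0.05, 0.9])
# T is kept small here so the sketch runs quickly; use a much longer series
# (e.g. 100,000 observations) for a meaningful parameter-recovery check.
data = simulate_egarch(true_params, 1000, rng)

start = np.array([data.mean(), (1 - 0.9) * np.log(data.var()), 0.1, 0.0, 0.9])
res = optimize.minimize(
    neg_loglik, start, args=(data,), method="Nelder-Mead",
    options={"maxfev": 400},
)
```

With a long simulated series, the minimizer's estimates should land close to `true_params`; if they systematically do not, the likelihood recursion or the objective is suspect.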

As for the constraints, they are very loose and should not normally bind. They are there to help the minimizer avoid regions of numerical instability.

One thing I notice is that you are failing to initialize the backcast value for lnsigma2[0]. Your starting values are also quite bad, since the intercept should be on the order of (1 - beta) * log(var).
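The two fixes above can be sketched as follows. The 1000-observation placeholder series and the 0.9 starting value for beta are assumptions for illustration only.

```python
import numpy as np

# Placeholder data; in practice use the actual return series.
rng = np.random.default_rng(42)
data = 0.01 * rng.standard_normal(1000)

beta0 = 0.9  # assumed starting value for beta
sample_var = data.var()

starting_vals = np.array([
    data.mean(),                         # mu
    (1.0 - beta0) * np.log(sample_var),  # omega on the order of (1 - beta) * log(var)
    0.1,                                 # alpha
    0.0,                                 # gamma
    beta0,                               # beta
])

# Backcast: give the recursion a sensible value for the first observation
# instead of leaving lnsigma2[0] unset.
lnsigma2 = np.empty(len(data))
lnsigma2[0] = np.log(sample_var)
```

Starting the intercept near (1 - beta) * log(var) puts the implied unconditional variance close to the sample variance, which keeps the optimizer out of bad regions.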

I'm going to close as this isn't really an issue in the package. Feel free to continue to post relevant information.