Closed: fermionportal closed this issue 9 years ago
I think you have it right. This also shows that it should be `2.0/power`, not `1.0/power`, when making this transformation.
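To make the exponent issue concrete, here is a minimal standalone sketch (illustrative values, not the arch source): the recursion tracks sigma raised to `power`, so recovering the variance requires the exponent `2.0/power`.

```python
import numpy as np

# The power-GARCH recursion evolves fsigma = sigma ** power; sigma2 is
# the conditional variance.  Recovering sigma2 from fsigma needs the
# exponent 2.0 / power, not 1.0 / power.
power = 1.0                    # e.g. an AVGARCH-style model
sigma = 0.02                   # conditional standard deviation
fsigma = sigma ** power        # what the recursion stores

sigma2_right = fsigma ** (2.0 / power)  # == sigma ** 2, the variance
sigma2_wrong = fsigma ** (1.0 / power)  # == sigma, off by a square root

assert np.isclose(sigma2_right, sigma ** 2)
assert not np.isclose(sigma2_wrong, sigma ** 2)
```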
What about the line `data[:max_lag] = sqrt(sigma2[:max_lag]) * sign(errors[:max_lag])`: should there really be the `sign()` function on `errors`?
This is to backcast the data to +/- `sqrt(sigma2)`. It isn't possible to use `errors` directly, since they do not follow a GARCH process. Essentially this means that in a GJR-GARCH(1,1,1) the evolution of the first fake data point will just be

`sigma2[1] = omega + alpha * sigma2[0] + gamma * sigma2[0] * (errors[0] < 0) + beta * sigma2[0]`
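For concreteness, a minimal sketch of this first step (parameter values are illustrative, and the variable names only mirror the discussion, not the arch internals):

```python
import numpy as np

# First step of a GJR-GARCH(1,1,1) simulation with the sign() backcast.
omega, alpha, gamma, beta = 0.05, 0.05, 0.10, 0.85
rng = np.random.default_rng(0)
errors = rng.standard_normal(2)

sigma2 = np.zeros(2)
data = np.zeros(2)
sigma2[0] = omega / (1 - alpha - 0.5 * gamma - beta)  # unconditional backcast
data[0] = np.sqrt(sigma2[0]) * np.sign(errors[0])     # backcast to +/- sqrt(sigma2)

# Since data[0] ** 2 == sigma2[0], this update reduces to the expression
# above: omega + alpha * sigma2[0] + gamma * sigma2[0] * (errors[0] < 0)
#        + beta * sigma2[0]
sigma2[1] = (omega + alpha * data[0] ** 2
             + gamma * data[0] ** 2 * (data[0] < 0)
             + beta * sigma2[0])
data[1] = np.sqrt(sigma2[1]) * errors[1]
```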
You should always burn at least `max([p, o, q])` data points, and in practice many more. I don't understand a case where you would want to burn 0.
You burn 0 when you want to simulate returns/sigmas over the next N days, starting from today. Also, the errors in the above case are supposed to be i.i.d. and not follow a GARCH process. For example, the EGARCH `simulate()` function correctly initializes the data to `sqrt(sigma2) * errors`. For a pure GARCH model, the above equation will give massively outsized first returns.
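A small sketch of the contrast (hypothetical numbers, not taken from the package), combining the `sign()` initialization with the `inv_power` exponent bug reported in the issue text below:

```python
import numpy as np

# With i.i.d. standard normal errors, scaling the raw error gives a
# proper draw; the sign() version pins the first residual to exactly
# one conditional standard deviation, and with the double square root
# bug that standard deviation is itself far too large.
rng = np.random.default_rng(1)
initial_value = 1e-4                       # a daily-return-scale variance
e0 = rng.standard_normal()

good = np.sqrt(initial_value) * e0         # ~ N(0, 1e-4)
buggy_sigma2 = initial_value ** 0.5        # sqrt taken once too often
bad = np.sqrt(buggy_sigma2) * np.sign(e0)  # |bad| == 0.1, a 10x outsized move
```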
`simulate` does not do what you want. In some special cases, e.g. a GARCH(1,1), you can almost get it to do this by correctly choosing `initial_value`, but for more general models, e.g. a GARCH(2,2), you cannot use `simulate` to simulate out-of-sample values, since the simulation code would need both initial values for `sigma2` as well as the final two values of epsilon to simulate this correctly. I suppose this is more of a feature request, since it is not what is intended and would require non-trivial modification. It would probably make more sense to add a new method that could simulate starting from some point in the actual sample, i.e. `simulate_from_data`, which would use the fitted model. This is closer to a `forecast` method IMO than straight simulation of a process, which is what these methods were designed for.
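Since `simulate_from_data` does not exist in the package, here is a hypothetical minimal sketch of the idea for a plain GARCH(1,1) only; the function name, signature, and parameter values are all illustrative, and a fitted model would supply `omega`/`alpha`/`beta` plus the final residual and variance:

```python
import numpy as np

def simulate_from_data(omega, alpha, beta, last_resid, last_sigma2,
                       nobs, seed=None):
    """Simulate nobs steps of a GARCH(1,1) forward from the end of a
    sample, seeding the recursion with the last fitted residual and
    conditional variance (so no burn is needed)."""
    rng = np.random.default_rng(seed)
    errors = rng.standard_normal(nobs)
    sigma2 = np.empty(nobs)
    data = np.empty(nobs)
    for t in range(nobs):
        prev_e = last_resid if t == 0 else data[t - 1]
        prev_s = last_sigma2 if t == 0 else sigma2[t - 1]
        sigma2[t] = omega + alpha * prev_e ** 2 + beta * prev_s
        data[t] = np.sqrt(sigma2[t]) * errors[t]
    return data, sigma2
```

As noted above, a GARCH(2,2) would need the final two residuals and the final two variances as seeds, which is exactly why the existing `simulate` interface cannot express this.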
That said, there is no reason not to accept your suggestion, since the choice between `sqrt(sigma2[:max_lag]) * errors[:max_lag]` and `sqrt(sigma2[:max_lag]) * sign(errors[:max_lag])` does not matter when `burn` is reasonably large. So there is no reason not to have the former, since it can be used in either ARCH(1) or GARCH(1,1) models to simulate from the end of a sample (I believe these are the only two where it would be correct).
Thanks for the comments, Kevin. Indeed, `simulate_from_data` is what I am more used to, and the current `simulate` is not too familiar to me. MATLAB's `garchsim()` follows the `simulate_from_data` approach. I'm close to finishing `simulate_from_data` for a few GARCH models, but I'm not yet sure it would fit into the current arch package design, so I don't know whether I should submit the code.
In volatility.py, in the GARCH class's `simulate` function, the following code:

```python
sigma2[:max_lag] = initial_value ** inv_power
data[:max_lag] = sqrt(sigma2[:max_lag]) * sign(errors[:max_lag])
```

seems to be taking the square root of `initial_value` twice, because `inv_power == 0.5` by default.
In any case, this shouldn't be like that, because it implicitly assumes that the previous error was equal to 1 or -1, and `sqrt(sigma2)` will then be a huge number, so the first return (residual) in `data` will on average be huge. It should instead replace the first `max_lag` entries of the `errors` variable with the previous returns (residuals).
This becomes significant if `burn=0`, which is the case for several practical applications of the `simulate` function.
I think this is a bug, but apologies if I misunderstood the logic.
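For reference, a sketch of the initialization the thread converges on (illustrative values, not a verbatim patch to volatility.py): raise `initial_value` to `2.0/power` so `sigma2` really is a variance, and scale the raw i.i.d. errors instead of their sign:

```python
import numpy as np

power = 2.0
inv_power = 2.0 / power                  # 1.0 for a standard GARCH
max_lag = 1
initial_value = 1e-4                     # backcast variance
errors = np.random.default_rng(2).standard_normal(max_lag + 100)

sigma2 = np.zeros_like(errors)
data = np.zeros_like(errors)
sigma2[:max_lag] = initial_value ** inv_power                  # == initial_value
data[:max_lag] = np.sqrt(sigma2[:max_lag]) * errors[:max_lag]  # proper draws
```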