Open gvdr opened 1 year ago
That all seems fine to me. They way I have done it so far does not require the existence of classes.
The default value for everything is nothing
. If it is not nothing, it is a vector of coefficients. Multiple dispatch can differentiate between the two. Is there a particular reason to introduce new Types (Julia does not have classes, so I am guessing you mean Types in point 1). For example, to enable the "moving average" component of the process, you simply assign the "moving average" keyword the value of a Static Vector instead of nothing.
I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).
I am not sure I understand what you mean by "hardcoded" (because I don't know what polynomial stuff means in this context). This is the main loop:
does it count as "hardcoded"? How much I loop depends on the user input, so I am not sure whether this qualifies as hardcoded or not.
The need for types (sorry) comes from trying to have multiple seasonality effects. Each seasonal component may have its own ar and ma coefficients. One $(2,0,3)s$ component, for example, will bring to the equation a $1-\phi_1B^s - \phi_2B^{2s}$ for the ar and a $1+\psi_1B^s + \psi_2B^{2s} + \psi_3B^{3*s}$ for the ma component (see here for example). If we want a sarima with weekly, monthly, and yearly seasonal components, we might in general need 6 new array of coefficients. An arima without seasonal component corresponds to a sarima with seasonality 1 in this fashion.
The most expressive way I could find to allow user to define potentially complicated sarimax was to define a building block type, and allowing for composition of these blocks. So, a sarima with seasonalities equal to 1, 4, 6, 9, would be written as:
this_sarima = [
sarima(ar = [...coefficients...], ma = [...coefficients...]),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 4),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 6),
sarima(ar = [...coefficients...], ma = [...coefficients...], s = 9),
]
where all the coefficients can be different (and in different number, even empty).
Oh okay. I see. Then this means that the s component (for seasonal) should have dedicated supertype and subtypes, to handle all of this simpler. However, I stand that the AR or MA components don't need dedicated types, and one can use static vectors as I have been doing so far. Right?
Absolutely right. I'm using static arrays to store the coefficients as well ☺️ (with a convenience constructor function that takes a regular vector and convert it to a SVector before calling the type constructor, so users don't need to deal directly with static vectors, unless they want to).
Okay, so then what stops us from merging the two packages?
COuldn't we have something like a "main" function sarfimax
that takes in s, ar, d, ma, x
(with d
for the integration), and just does everuthing?
Something similar seems doable.
With the minor cavaet that, for the moment, in Stranbo
the functions sarima
/ sarimax
construct the components, they don't run the simulation (that is left to sample(VectorOfComponents)
). It would be easy to a sarfimax
constructor. I would need to define a sample
method for the sarfimax
type, which will rely on your code.
Does it make sense to have "component" which is fractionally integrated? Do fractional integration admits seasonality? That is, would the user want to write something like:
this_sarfimax = [
sarima(d = 1, ar = ..., ma = ...),
sarima(d = 1, ar = ..., ma = ..., s = 4),
sarfima(d = 0.3, ar = ..., ma = ..., s = 6)
]
sample(this_sarfimax, 100_000)
or would this be nonsensical?
What's the advantage of not running the simulation and making the user go the extra step of calling sample
? Doesn't this unecessarily complicate things? Is there anything else that is possible to do with this VectorOfComponents
besides just getting the timeseries?
I don't think it makes sense to have d::Real
with seasonal components. If d
is not an integer and s
is not nothing, we should throw an error.
What's the advantage of not running the simulation and making the user go the extra step of calling sample? Doesn't this unecessarily complicate things?
I'm no sure. With an example, let's define a (1,1,1)x(1,1,0)5x(1,0,1)10. For me the choice seems to be between:
# my syntax
a_sarima_proc = [
sarima(d = 1, ar = [.8,.3,.2], ma = [.3,.1]), # non seasonal component
sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]
sample(a_sarima_proc,100_000)
and
# standard syntax
sarima(
d = [1,1,0],
ar = [ [.8,.3,.2], [.4], [.5,.3]],
ma = [[.3,.1], nothing, [.4,.3,.1]],
s = [1,5,10]
)
For me the first is way more readable.
Is there anything else that is possible to do with this VectorOfComponents besides just getting the timeseries?
For one, it is easier to manipulate. For example, if I want to add a new seasonality effect I can just do:
new_effect = sarima(d = 2, ma = [0.4], s = 20)
new_sarima_proc = vcat(a_sarima_proc, new_effect)
sample(new_sarima_proc,100_000)
And there is some experimental feature I'm working out. So, it should be it is possible to do something like the following, to obtain 3 correlated series (eheh, I was not expecting it to just work).
using Distributions, LinearAlgebra
# we build a var-covar matrix
Σ = rand(3,3)
Σ = Σ .* Σ'
# sample n points
WNoise = MvNormal([.0,.0,.0], I(3)+Σ)
z = rand(WNoise,100_000)
# define equivalent seasonality effects
seasonalities = [
sarima(d = 1, ar = [.4], s = 5), # first seasonal effect
sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]
# define base processes
b1 = sarima(ar = [.5], ma = [.5], dₙ = z[1,:])
b2 = sarima(ar = [.5], ma = [.5], dₙ = z[2,:])
b3 = sarima(ar = [.5], ma = [.5], dₙ = z[3,:])
three_trajectories = sample.(
[vcat(b1,seasonalities),
vcat(b2,seasonalities),
vcat(b3,seasonalities)],
100_000
)
# try different seasonal processes
new_seasonalities = [
sarima(ar = [.2], ma = [.1,.01,.001], s = 7), # first seasonal effect
sarima(ar = [.5,.3], ma = [.4,.3,.1], s = 10) # second seasonal effect
]
other_three_trajectories = sample.(
[vcat(b1,new_seasonalities),
vcat(b2,new_seasonalities),
vcat(b3,new_seasonalities)],
100_000
)
I'm sure all of this can be done in other syntaxes, but I think this one is rather clear and easy to use.
Okay, the vector syntax is fine.
But then, the only thing you can utilize for ARFIMA.jl is the fractional integration, right? But the fractional integration doesn't play well with seasonal. So the fractional integration will be used with vectors that always have only one component.
Can we define the convenience sample(::Component)
dispatch as well?
And what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that? Do we add the seasonal stuff here?
Can we define the convenience sample(::Component) dispatch as well?
Sure, that's already there:
julia> using Stranbo
julia> test = sarima(ar = [.8], ma = [2.])
SARIMA{Float64}(1, 0, [0.8], [2.0], Distributions.Normal{Float64}(μ=0.0, σ=1.0))
julia> sample(test,1000)
1000-element Vector{Float64}:
-0.3901653040936471
-2.545535084952462
-3.169123738677272
1.3247596769491126
1.1210945293432513
1.7294814531680687
4.1878387343420105
⋮
-1.6600430819823186
-4.6308613372374134
-8.743526206444377
-8.697082865177684
-2.354005619615946
0.013866600868192025
what is the plan forwards now? Where do we put all methods? Do we make a new package? SARFIMAX or something like that?
Open to any solution. Calling it sarfimax might mean we need to change it again if we include something else? But I'm terrible with names! :-D
the only thing you can utilize for ARFIMA.jl is the fractional integration, right?
That, and I should take a look closely at how you handle the bdp()
. I've a similar operator, but I fear that is the performance bottleneck for me.
How much of the source code here can you use? Can you add your seasonal functionality to this package? And we rename this package into AutoregressiveSimulations.jl and release it a-new? If you think using the source code here is too hard, we can start a new package AutoregressiveSimulations.jl from scratch.
package name is up to debate of course.
Sorry, had very busy days, I'll get to this asap :-)
Leaving an issue here to consider whether we might integrate the functionalities I have in Stranbo and
arfima
.Stranbo is designed for generating time series with "complicated" sarimax (integer $i$) processes. Complicated means many seasonal effects (think, daily + weekly + monthly + yearly, ..., each one with its own d, ar, and ma coefficients) and/or many auxiliary components (again, each one with its own d, ar, and ma coefficients).
It is slower than ARFIMA, a little bit due to the generality, and a tad due to the fact that I don't use any
@fastmath
tricks (I got scared because of this kind of things, but maybe here it's safe?).The high-picture view of how Stranbo work is:
I believe that 4 is in mostly shared between the two packages (but I can learn a bit to improve even more). I don't think the polynomial stuff is in ARFIMA (I believe you have hardcoded the loops, right?).