Open juanitorduz opened 4 months ago
I think this one is ready for a first review round.
I have not been able to make the scan reference work with {func}`~pytensor.scan.basic.scan` nor {func}`~pytensor.scan` ... any tips? Thanks
View / edit / reply to this conversation on ReviewNB
AlexAndorra commented on 2024-03-11T18:53:02Z ----------------------------------------------------------------
`pm.AR` -- what are the benefits? In which cases is it worth the trouble dealing with `pytensor` and `scan` directly?

ricardoV94 commented on 2024-03-13T09:58:34Z ----------------------------------------------------------------
Agree with Alex. The main motivation is that this framework allows you to define arbitrary time series models, not just things that are pre-packaged in PyMC. For the AR example, one could for instance use different noise (e.g. StudentT) or covariates that change over time...
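As a concrete illustration of that flexibility, here is a minimal numpy sketch (parameter values are made up for illustration) of an AR(2) whose innovations are Student-t rather than Normal -- the kind of variation that is easy with a hand-written recursion but not with a pre-packaged distribution:

```python
import numpy as np

def simulate_ar2(rho1, rho2, sigma, nu, n_steps, rng):
    """Simulate x_t = rho1*x_{t-1} + rho2*x_{t-2} + eps_t with
    Student-t innovations (heavier tails than the usual Normal)."""
    x = np.zeros(n_steps)
    for t in range(2, n_steps):
        eps = sigma * rng.standard_t(df=nu)
        x[t] = rho1 * x[t - 1] + rho2 * x[t - 2] + eps
    return x

rng = np.random.default_rng(42)
trajectory = simulate_ar2(rho1=0.9, rho2=-0.2, sigma=1.0, nu=3, n_steps=200, rng=rng)
```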
ricardoV94 commented on 2024-03-13T10:07:00Z ----------------------------------------------------------------
Maybe add a second more complex example, either MA2 https://gist.github.com/ricardoV94/a49b2cc1cf0f32a5f6dc31d6856ccb63#file-pymc_timeseries_ma-ipynb or one of those Jesse wrote here https://gist.github.com/jessegrabowski/ccda08b8a758f882f5794b8b89ace07a ?
jessegrabowski commented on 2024-03-13T10:28:06Z ----------------------------------------------------------------
I actually disagree, I think an AR(2) is a fine choice. I was going to put suggestions for other models here (ARIMA-GARCH or ETS), but I actually think it's better to keep this notebook really simple and focus on the machinery, which is quite complex.
ricardoV94 commented on 2024-03-13T10:51:14Z ----------------------------------------------------------------
Showing a non-recursive time varying parameter could be useful though? Can split into two separate notebooks?
jessegrabowski commented on 2024-03-13T10:54:36Z ----------------------------------------------------------------
I think that's a good 2nd example, because it also serves as a tutorial on the difference between `outputs_info`, `sequences`, and `non_sequences`.
Even if it's not a time-varying parameter, maybe an example that shows how to combine an exogenous regression with an AR model, so you're just scanning in some covariate data and doing a linear model with AR distributed errors.
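Since that distinction comes up repeatedly in the thread, here is a plain-Python sketch (illustrative only, not pytensor's actual implementation, and limited to a single `-1` tap) of how `scan` feeds its three argument kinds to the step function:

```python
def scan_like(step_fn, sequences, outputs_info, non_sequences, n_steps):
    """Plain-Python sketch of pytensor.scan's argument kinds
    (only supports a single output with taps=-1)."""
    history = list(outputs_info)  # outputs_info: recursion state, fed back
    outputs = []
    for t in range(n_steps):
        seq_vals = [s[t] for s in sequences]  # sequences: indexed per step
        out = step_fn(*seq_vals, *history, *non_sequences)  # non_sequences: constants
        outputs.append(out)
        history = [out]
    return outputs

# AR(1) with a time-varying covariate: x_t = rho * x_{t-1} + beta * z_t
z = [1.0, 2.0, 3.0]
xs = scan_like(
    step_fn=lambda z_t, x_prev, rho, beta: rho * x_prev + beta * z_t,
    sequences=[z],
    outputs_info=[0.0],
    non_sequences=[0.5, 1.0],
    n_steps=3,
)
# xs == [1.0, 2.5, 4.25]
```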
juanitorduz commented on 2024-05-06T12:17:35Z ----------------------------------------------------------------
Maybe add a second more complex example, either MA2?
I suggest we keep this notebook simple and work out other more complex examples in a different notebook (I can also work on it). In my experience, these models can be overwhelming the first time a user sees them, so let's keep it simple for this one :D
AlexAndorra commented on 2024-03-11T18:53:03Z ----------------------------------------------------------------
Would be useful to explain what `collect_default_updates` does, as well as the `scan` API, which is always tricky. Also what `ar_init`'s type should be -- `taps` is like a countdown?
Re: collect_default_updates, it tells PyMC that the RV in the generative graph should be updated in every iteration of the loop. Agree with adding more context on how Scan is defined and linking to the PyTensor docs for a deeper dive: https://pytensor.readthedocs.io/en/latest/library/scan.html
juanitorduz commented on 2024-05-06T13:34:35Z ----------------------------------------------------------------
Added more info!
AlexAndorra commented on 2024-03-11T18:53:04Z ----------------------------------------------------------------
Explain why you set all the observed data nodes to 0
jessegrabowski commented on 2024-03-13T10:51:13Z ----------------------------------------------------------------
Why is there an observed value for the initial condition? We never observe this by definition.
ricardoV94 commented on 2024-03-13T12:50:51Z ----------------------------------------------------------------
I don't see why you can't observe it?
jessegrabowski commented on 2024-03-13T13:41:37Z ----------------------------------------------------------------
Because the first observation in the data is $x_0$, so the initial conditions $x_{-1}$ and $x_{-2}$ are by definition unobserved
The way this model is written, it assumes that the first observations of the data are generated by some arbitrary normal distribution, which then go on to spontaneously kick-off an autoregressive process that describes the rest of the data. This isn't logical. The correct definition of the model should consider all observed data as part of the autoregressive process
juanitorduz commented on 2024-05-06T13:35:35Z ----------------------------------------------------------------
@jessegrabowski I used the code you shared on discourse and that is why I added you as a co-author :)
AlexAndorra commented on 2024-03-11T18:53:05Z ----------------------------------------------------------------
Can't we get rid of the `observed` in the first model, then use `pm.observe` here? That makes for a clearer API and experience for the readers
ricardoV94 commented on 2024-03-13T10:03:25Z ----------------------------------------------------------------
Agree. Let me know if something breaks
jessegrabowski commented on 2024-03-13T13:46:49Z ----------------------------------------------------------------
I blatantly plagiarized this notebook and used `pm.observe` in a discourse thread here, maybe it could be helpful.
juanitorduz commented on 2024-05-06T13:35:51Z ----------------------------------------------------------------
Fixed in 67ec83d
AlexAndorra commented on 2024-03-11T18:53:06Z ----------------------------------------------------------------
Typo: "we see the model is capturing the global dynamics of the time series. In order to have a abetter a better insight of the model"
jessegrabowski commented on 2024-03-13T10:24:47Z ----------------------------------------------------------------
I think a discussion of conditional and unconditional posteriors is needed here. Many users will be surprised by this posterior because they are used to seeing conditional one-step forecasts, $p(x_t | \{x_\tau\}_{\tau=0}^{t-1})$ (what you get in statsmodels/stata/everything), which are much tighter and follow the data more closely.
At the risk of scope-creep, I think it's also important to show users how to use a predictive model to get the conditional posterior. It would also be the first place in pymc-examples that shows how to use a predictive model -- up until now we only have the labs blog.
May be a good excuse to use `pm.observe` instead of the dummy `MutableData`
Thank you all for the feedback! Now that the HSGP example was merged I can come back to this one (there are no dependencies but I was busy with other stuff). Apologies for the delay!
Awesome!
@jessegrabowski I think your point about conditional vs unconditional posterior is indeed very important! In order to generate the conditional posterior, do I need to condition on the whole observed time series somehow recursively to get the one-step prediction? Do you have a snippet or code to do this? Or maybe I am over-thinking it?
You'd take the posterior parameters and scan over the observations, setting each observation as $y_t$ then predicting $y_{t+1} \mid y_t, \theta$. Here's a quick gist I whipped up that I think does the right things:
https://gist.github.com/jessegrabowski/d3ea37e521dc3e63d89864d0bbcf54e0
Wow! Cool! I will take a look at it in detail!
@jessegrabowski Here (https://github.com/pymc-devs/pymc-examples/pull/642/commits/c842de35f25fa558b49504955da66f63f2d476f1) is my first attempt. I think it looks reasonable. Did I miss something :) ?
I made errors in that gist and you inherited them, sorry. In particular, you have to shift the dates in the coords forward by 1, because the data at (for example) 2001-01-01 is used to create the prediction for 2001-02-01. If you just re-use the old dates, the 2001-02-01 prediction will be mislabeled as 2001-01-01, and your model will look better than it is. In general, there should be some "chasing" visible, where the conditional posterior looks like a forward shifted version of the data (because that's what it is, modulo some scaling!)
I updated the gist to fix this error, and also added a section on forecasting that might be interesting as well.
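The labeling pitfall can be demonstrated with a toy numpy check (made-up AR(1) data, for illustration only): pairing each one-step prediction with the observation it was computed *from*, instead of the one it is *for*, makes the fit look spuriously good:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy AR(1) data and a one-step-ahead predictor x_hat_{t+1} = rho * x_t
rho = 0.9
x = np.empty(100)
x[0] = 0.0
for t in range(1, 100):
    x[t] = rho * x[t - 1] + rng.normal()

one_step = rho * x[:-1]  # predictions FOR times 1..99, made FROM times 0..98

# Correct alignment: compare the prediction for time t with x[t]
correct_rmse = np.sqrt(np.mean((x[1:] - one_step) ** 2))

# Mislabeled alignment: pairing each prediction with the observation it
# was made from (the date-shift bug) is basically comparing the data to
# a scaled copy of itself, so the error looks far smaller than it is
mislabeled_rmse = np.sqrt(np.mean((x[:-1] - one_step) ** 2))
```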
@jessegrabowski That gist is gold! I would love to have it in the examples as well, as it shows how to handle hidden states (not that relevant for the AR model).
I tried to implement this in the AR(2) model and compare it with statsforecast. WDYT?
@jessegrabowski @ricardoV94 just a small reminder this one is ready for review.
It's on my top list, either this weekend or Monday!
jessegrabowski commented on 2024-06-15T07:44:02Z ----------------------------------------------------------------
{meth}`~pymc.pytensorf.collect_default_updates` tells PyMC that the random variable (RV) in the generative graph should be updated in every iteration of the loop.
I suggest we explain what happens if we don't do this. E.g. add: "...updated in every iteration of the loop. If we don't do this, the random states will not update between time steps, and we will sample the same innovations over and over."
Thanks! Done
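The failure mode described ("sample the same innovations over and over") has a simple numpy analogue; this is only an illustration of why the random state must advance between steps, not PyMC's actual mechanism:

```python
import numpy as np

# If the random state is never updated, each step effectively restarts
# from the same state and draws an identical "innovation":
frozen = [np.random.default_rng(0).normal() for _ in range(3)]

# With the state carried forward between steps, each draw is fresh:
rng = np.random.default_rng(0)
advancing = [rng.normal() for _ in range(3)]
```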
jessegrabowski commented on 2024-06-15T07:44:03Z ----------------------------------------------------------------
Line #18. outputs_info=[{"initial": ar_init, "taps": range(-lags, 0)}],
Make a note somewhere about `taps`? It's not explained in the "what are all these functions" section.
I added some explanation.
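To make that explanation concrete, here is a plain-Python sketch (not pytensor's real machinery) of what `taps` does: it lists which previous outputs, as negative offsets, the step function receives on each iteration:

```python
def scan_with_taps(step_fn, outputs_info, taps, n_steps):
    """Feed the step function the outputs at the given negative offsets,
    e.g. taps=[-2, -1] passes (x[t-2], x[t-1]) -- a sketch of scan's taps."""
    buf = list(outputs_info)  # must hold enough initial values for the taps
    for _ in range(n_steps):
        args = [buf[t] for t in taps]  # negative indices reach back in time
        buf.append(step_fn(*args))
    return buf[len(outputs_info):]

# AR(2)-style recursion: x_t = 0.5 * x_{t-2} + 0.25 * x_{t-1}
out = scan_with_taps(
    step_fn=lambda x_tm2, x_tm1: 0.5 * x_tm2 + 0.25 * x_tm1,
    outputs_info=[1.0, 2.0],  # x_{-2}, x_{-1}
    taps=[-2, -1],            # matches taps=range(-lags, 0) with lags=2
    n_steps=3,
)
# out == [1.0, 1.25, 0.8125]
```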
jessegrabowski commented on 2024-06-15T07:44:04Z ----------------------------------------------------------------
The comment about `ar_init` needs to be updated, since you're giving it a prior now
juanitorduz commented on 2024-06-17T20:39:32Z ----------------------------------------------------------------
Done!
jessegrabowski commented on 2024-06-15T07:44:04Z ----------------------------------------------------------------
For an AR(2) the stationary condition is a bit more complex than rho < 1, see https://python.quantecon.org/_images/cef3e8f4eec75544170109a064bf2be7413cf3206506c0252d5c1b2dc57c6311.png from https://python.quantecon.org/samuelson.html
Stationarity is a very important topic for ARIMA modeling, but also somewhat out of scope for this notebook. I would just change this sentence to say something like "given that our prior for the `rho` parameter is weakly informative and centered on zero"
juanitorduz commented on 2024-06-17T20:39:51Z ----------------------------------------------------------------
Done! Thanks for the reference!
jessegrabowski commented on 2024-06-15T07:44:05Z ----------------------------------------------------------------
Link to the `pm.observe` docs here?
juanitorduz commented on 2024-06-17T20:57:56Z ----------------------------------------------------------------
yes!
jessegrabowski commented on 2024-06-15T07:44:06Z ----------------------------------------------------------------
Typo: Observe we need to "rewrite" the generative graph to include the conditioned transition step
I'd note here something about how `pm.sample_posterior_predictive` works, for example:
"...the conditioned transition step. When you call `pm.sample_posterior_predictive`, PyMC will attempt to match the names of random variables in the active model context to names in the provided `idata.posterior`. If a match is found, the specified model prior is ignored, and replaced with draws from the posterior. This means we can put any prior we want on these parameters, because it will (hopefully!) be ignored. We choose `pm.Flat` because you cannot sample from it. This way, if PyMC does not find a match for one of our priors, we will get an error to let us know something isn't right"
juanitorduz commented on 2024-06-17T20:46:59Z ----------------------------------------------------------------
Added!
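The matching rule described above can be sketched in plain Python (hypothetical names and structure; PyMC's internals differ) to show why `pm.Flat` acts as a tripwire for unmatched variables:

```python
def resolve_draws(model_priors, posterior):
    """Sketch of the matching rule: a variable whose name appears in the
    posterior gets posterior draws (its prior is ignored); an unmatched
    variable falls back to sampling its prior -- which errors for Flat."""
    draws = {}
    for name, prior_sampler in model_priors.items():
        if name in posterior:
            draws[name] = posterior[name]  # posterior wins over the prior
        else:
            draws[name] = prior_sampler()  # a Flat prior raises here
    return draws

def flat_prior():
    # Stand-in for pm.Flat: it cannot be sampled, so a missing match
    # surfaces as a loud error instead of silently wrong draws.
    raise RuntimeError("cannot sample from Flat")

posterior = {"rho": [0.5, 0.6]}
matched = resolve_draws({"rho": flat_prior}, posterior)

try:
    resolve_draws({"sigma": flat_prior}, posterior)
    unmatched_errored = False
except RuntimeError:
    unmatched_errored = True
```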
jessegrabowski commented on 2024-06-15T07:44:07Z ----------------------------------------------------------------
Line #7. y_data = pm.Data("y_data", ar_obs[-lags:], dims=("lags",))
Suggest to give this a more obvious name to highlight that we've changed it from the full trajectory to just the last `n_lag` observations. Something like `forecast_initial_state`?
yes!
jessegrabowski commented on 2024-06-15T07:44:07Z ----------------------------------------------------------------
Line #10. ar_init = pm.Flat(name="ar_init", dims=("lags",))
This isn't used in this model -- I'd leave it out if possible
@juanitorduz sorry I slept on this for so long. I made some suggestions, mostly nitpicks. Overall looks great!
ricardoV94 commented on 2024-06-16T11:33:46Z ----------------------------------------------------------------
Typo: `outputs_info`: "The is the list".
Also cross-link to the PyTensor docs on Scan, if not already?
ricardoV94 commented on 2024-06-16T11:33:47Z ----------------------------------------------------------------
Can we get an AR2 that wobbles up and down a bit more slowly? I think that will make the plots better, and distinguish the conditional from the marginal predictions better as well?
You can either pass another seed and/or use `pm.do` to set the rho to interesting values before calling `sample_prior_predictive`
juanitorduz commented on 2024-06-17T21:12:58Z ----------------------------------------------------------------
Any suggested values? Changing the seed does not change that much, and the rho values go very close to zero many times :D
ricardoV94 commented on 2024-06-18T12:07:44Z ----------------------------------------------------------------
I don't know if this can be recovered, but perhaps worth a shot? https://colab.research.google.com/drive/1yLrxTBRPa08B8EIEh6NGWG_aLFxIbanh?usp=sharing
jessegrabowski commented on 2024-06-18T12:31:34Z ----------------------------------------------------------------
You could pick parameters that are 1) strongly persistent, and 2) give imaginary eigenvalues and generate oscillating trajectories. For example, rho_1 = 0.99, rho_2 = -0.99/4
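For reference, the suggested parameter choice can be checked numerically with the companion matrix (a sketch, using the values from this thread): the AR(2) is stationary when all eigenvalues lie strictly inside the unit circle, and complex eigenvalues give oscillating trajectories.

```python
import numpy as np

def is_stationary_ar2(rho1, rho2):
    """x_t = rho1*x_{t-1} + rho2*x_{t-2} is stationary iff all eigenvalues
    of the companion matrix lie strictly inside the unit circle."""
    companion = np.array([[rho1, rho2],
                          [1.0, 0.0]])
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1.0))

def oscillates(rho1, rho2):
    """Complex eigenvalues (rho1**2 + 4*rho2 < 0) give oscillating paths."""
    return rho1**2 + 4 * rho2 < 0
```

With `rho_1 = 0.99, rho_2 = -0.99/4` this gives a stationary, strongly persistent, oscillating process, matching the suggestion above.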
ricardoV94 commented on 2024-06-16T11:33:48Z ----------------------------------------------------------------
Line #10. ar_init = pm.Flat(name="ar_init", dims=("lags",))
Can you just `pm.do({ar_init: y_data[-2:]})` and reuse the original model?
I think as the conditional step has `n_steps=trials - lags` and the forecasting model has a generic `n_steps=forecast_steps`, it is not that straightforward, right?
Does that matter? `n_steps` can also be a Data variable that you change?
As discussed with @ricardoV94 I will port the gist https://gist.github.com/ricardoV94/a49b2cc1cf0f32a5f6dc31d6856ccb63#file-pymc_timeseries_ma-ipynb into the PyMC Example Gallery. I will add text and explanation to the existing working code :)
📖 Documentation preview 📖: https://pymc-examples--642.org.readthedocs.build/en/642/