Open mitchelloharawild opened 1 year ago
Do people get to vote on it ;)? I like options A and B - isn't it possible to implement both of them (or A and C for that matter) if `reconcile` is S3? (Not sure if implementing both is a good design choice)
Talking with Tommy, the role of reconciliation is unclear. In this framework, we are doing:
data |>
... |>
model(...) |>
reconcile(...) |>
forecast(...)
However, the real strength of reconciliation is that it is based on forecasts, not models. For example, the structure above does not accommodate judgmental forecasts. In such situations, we need something like this:
data |>
... |>
forecast(...) |>
reconcile(...)
However, how to obtain the residuals for the covariance matrix is still a problem with this configuration. We need to talk more about that.
Yes, welcoming votes and discussion.
I just chatted with @robjhyndman and came up with Option D, where the function describes the utilisation of forecasts across the graph. This is my current preference, and is most similar to our current interface (`min_trace()` -> `all_nodes()` or something similar).
It's possible to implement all of the above at the same time, but that could be confusing as many functions give the same result.
We can also have a `reconcile()` method for `<fable>` classes if it is really needed, but I don't see why this is required yet.
Could you elaborate on the judgemental forecast reconciliation a bit more?
> We can also have a `reconcile()` method for `<fable>` classes if it is really needed, but I don't see why this is required yet. Could you elaborate on the judgemental forecast reconciliation a bit more?
Yes sure. The judgemental forecasting (e.g. the Delphi method) I was referring to is just an example where reconciliation should be applicable when the model object is not readily available.
For example, when one has forecasts that do not come from the fable package (maybe they come from computationally intensive machine learning models in Python or C++) but are stored in a CSV file and loaded in R as a fable object, reconciliation should still be possible, since reconciliation depends on the forecasts themselves (and the covariance matrix), not on the models that generate the forecasts.
If fable contains all possible forecast models so forecasts can come from fable in any case, then building the reconcile function only on top of mable objects may be reasonable, but that might be too strong of an assumption to make.
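To make the point concrete, here is a minimal base-R sketch of reconciling forecasts that arrive without any model object. It applies the standard projection `S (S' W^-1 S)^-1 S' W^-1` to a vector of base forecasts. The three-series hierarchy, the forecast values, and the function name `reconcile_forecasts()` are all hypothetical, and the identity covariance (which gives OLS reconciliation) is an assumption standing in for whatever covariance estimate is available. This is not the fabletools implementation.

```r
# Hypothetical hierarchy: total = a + b.
# Each row of S expresses one series in terms of the bottom-level series (a, b).
S <- rbind(
  total = c(1, 1),
  a     = c(1, 0),
  b     = c(0, 1)
)

# Base forecasts from any source (e.g. read from a CSV file); made-up numbers
# that are deliberately incoherent (40 + 50 != 100).
y_hat <- c(total = 100, a = 40, b = 50)

# Assumed forecast error covariance; the identity matrix gives OLS reconciliation.
W <- diag(3)

# Hypothetical helper: project base forecasts onto the coherent subspace.
reconcile_forecasts <- function(y_hat, S, W) {
  W_inv <- solve(W)
  # G maps all base forecasts to reconciled bottom-level forecasts.
  G <- solve(t(S) %*% W_inv %*% S, t(S) %*% W_inv)
  drop(S %*% G %*% y_hat)
}

y_tilde <- reconcile_forecasts(y_hat, S, W)
```

Only `y_hat`, `S`, and `W` are needed here, which is why the covariance matrix is the sticking point when no model (and hence no residuals) is available.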
I agree with @daniGiro that reconciliation should be independent of the models. When I first read how fable implements reconciliation (the current interface), I thought `reconcile` came before `forecast` because of practical considerations: some information is only accessible in `mable`s (e.g. the covariance matrix), and the actual reconciliation is done inside the `forecast` method anyway. But
data |>
... |>
forecast(...) |>
reconcile(...)
is really how I think of reconciliation.
I agree that it should be possible to reconcile a `<fable>`, but also think that it should be possible to impose reconciliation constraints on a list of models in a `<mable>`. I think we should support both, but reconciling a `<fable>` will require more inputs (such as its weights / residuals / response / etc.)
The current interface of reconciling a `mable` is part practical and part conceptual. Broadly speaking, I think reconciliation (or producing coherent forecasts) is satisfying some additional constraints on the model. If these constraints are imposed, they should also hold true for in-sample fitted values and residuals. In the future I plan for `fitted()` to optionally provide a `<fable>` output which can be (or is) coherent.
From my discussions with @mitchelloharawild today, there seems to be some overlap between "reconciliation of forecasts" and "reconciliation of data" more generally -- i.e. users might need to do "reconciliation" on imported data before any modelling.
In general, it could be useful to make something like `top_down()` in Option D applicable to more than just the forecasts. I'm working on a functional approach (in the "matrices as a map" sense) to data harmonisation in {conformr} that could be extended to facilitate reconciliation/coherency of data.
Had another discussion with @robjhyndman today, mostly about graph reconciliation data structures.
We have also discussed functions to impose aggregation constraints into a tsibble. A weights column could be used for defining linear combination weights across nodes, but for arbitrary graphs an edge linked weights matrix might be needed.
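As a rough illustration of the edge-linked weights idea, the sketch below turns an edge list into a constraint matrix `C` such that coherent values satisfy `C %*% y = 0`. The node names, the `edges` data frame layout, and the unit weights are all hypothetical. Nothing here reflects an existing tsibble or fabletools data structure.

```r
# Hypothetical edge list: each row says "parent includes weight * child".
edges <- data.frame(
  parent = c("total", "total", "north", "north"),
  child  = c("north", "south", "n1", "n2"),
  weight = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

nodes <- unique(c(edges$parent, edges$child))

# Weighted adjacency matrix: A[parent, child] = weight.
A <- matrix(0, nrow = length(nodes), ncol = length(nodes),
            dimnames = list(nodes, nodes))
A[cbind(edges$parent, edges$child)] <- edges$weight

# One constraint row per parent node: parent - sum(weight * children) = 0.
I <- diag(length(nodes))
dimnames(I) <- dimnames(A)
parents <- unique(edges$parent)
C <- (I - A)[parents, , drop = FALSE]

# Made-up coherent values, used only to check the constraints.
y <- c(total = 7, north = 3, south = 4, n1 = 1, n2 = 2)
residual <- drop(C %*% y[nodes])
```

With non-unit values in `weight`, the same construction covers arbitrary linear combinations across nodes, which a single weights column cannot express when a node has more than one parent.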
Option D is the interface we're leaning toward, since the choice of function drastically changes the output and is simple to learn. All other parameters can then be arguments with suitable defaults.
Some practical examples on graph coherency would be useful before finalising an interface for this. As far as I can imagine, graphs are usually suited to being the only aggregation column, but it is theoretically possible to nest and cross these graph hierarchies.
I'm struggling to think of a graph linear combination reconciliation problem that isn't adequately represented with grouped hierarchies. The closest I've come is this toy example:
The number of apples and oranges sold determine both the total weight of produce and the total price/sales over time. Then these metrics are combined to give some measure of value. :shrug:
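One way to write the toy example down, purely as a sketch: every node is a linear combination of the two bottom-level counts, so the whole graph collapses into a single matrix `S`. The per-unit weights, the prices, and the formula for value are invented for illustration, since the thread does not specify them.

```r
# Bottom-level nodes: units of apples and oranges sold.
# All coefficients below are made up for illustration.
S_bottom <- diag(2)
dimnames(S_bottom) <- list(c("apples", "oranges"), c("apples", "oranges"))

S_weight <- c(apples = 0.2, oranges = 0.3)  # assumed kg per unit
S_sales  <- c(apples = 0.5, oranges = 0.8)  # assumed price per unit

# Assumed definition of value: sales minus a handling cost of 0.1 per kg.
# Because value is linear in weight and sales, it is also linear in the counts.
S_value <- S_sales - 0.1 * S_weight

# Stack every node's row: S maps bottom-level counts to all nodes in the graph.
S <- rbind(value = S_value, weight = S_weight, sales = S_sales, S_bottom)

# Made-up counts, mapped to coherent values for every node.
bottom <- c(apples = 10, oranges = 20)
y <- drop(S %*% bottom)
```

Under these assumptions the graph needs nothing beyond a summing-style matrix with non-unit entries, which is consistent with the difficulty of finding a case that grouped hierarchies cannot represent.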
I'm going to continue trying to think of useful graph reconciliation problems, but it may not be something necessary to incorporate into the interface. If it is incorporated I think the graphs would be represented via a single key column with some extra attributes that describe the relations between nodes.
I've thought about this more from a data structures perspective and I think it is neatest to store a graph of the constraints in the tsibble/mable/fable objects.
https://arxiv.org/abs/2204.09231 provides some details on how to keep some nodes immutable, which I think is how we should handle 0 variance nodes.
> I've thought about this more from a data structures perspective and I think it is neatest to store a graph of the constraints in the tsibble/mable/fable objects.
Yes, I think so too.
User-defined control parameters.
Construction
Weight matrix (typically requires access to model object and varies with data structure)
Optimisation technique
Data structure
Combination method/type
Are there more things that can be customised here?
User interface
Data structure and value combination method/type
Data structure and combination method are passed in via data attributes created at the `aggregate_*()` step. Allow the user to directly impose data structure constraints, for example defining a pre-existing aggregation structure from the data. This can also be used to remove aggregation structure to create disjoint hierarchies. For example, you may have a cross-temporal structure but only want to make it temporally coherent. To achieve this, you can remove the key aggregation constraints.
Hold onto aggregation structure in `<tsibble>`, and `<mdl_lst>`
Code
Allow reconciliation of mables, fitted models, and model definitions.
Option A - `reconcile()` on model with all params as args
Option B - `reconcile()` on mable with opt function as reconcile input fn
Option C - `reconcile()` on mable with construction function as reconcile input fn
Option D - `reconcile()` on mable with node utilisation function as reconcile input fn
Attention: @danigiro, @robjhyndman, @GeorgeAthana