mgiangreco closed this issue 7 years ago
I am not sure I understand you correctly, so I will need more feedback.
The grouping order can be changed to produce an __item1_Forecast that may be more relevant (one can experiment with group1_item2 instead of item1_group2 ;).
Here's a toy example:
https://github.com/mgiangreco/pyaf_demos/blob/master/grouping_demo.ipynb
You can see in the last cell that the '2017-01-06' forecasts for 'item1_group1_OC_Forecast' and 'item1_group2_OC_Forecast' are slightly different, even though the forecasts are for the same item. My question is how to resolve this discrepancy.
Are you asking why the item1 forecast differs from one group to another, even though item1 is the same column in both groups?
The hierarchical forecasts differ between the two groups because item1 is not grouped the same way in each. That is expected, unless both groups are identical.
As an example, Monday forecast will be different if you group it with week days or month days.
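To make the point concrete, here is a minimal sketch in plain Python. It is not PyAF code and does not reproduce the method PyAF actually applies; it assumes a simple top-down split, where a group-level forecast is divided among members by their historical share. All names and numbers (`top_down_forecast`, `item2`, `item3`, the group totals) are made up for illustration.

```python
# Toy illustration only (not PyAF code): top-down disaggregation of a
# group-level forecast by historical shares. All values are hypothetical.

def top_down_forecast(history, group_forecast, member):
    """Split a group-level forecast among members by their historical share."""
    total = sum(sum(values) for values in history.values())
    share = sum(history[member]) / total
    return share * group_forecast

# The same item1 history appears in both groups.
item1 = [10.0, 12.0, 11.0]

group1 = {"item1": item1, "item2": [20.0, 22.0, 21.0]}
group2 = {"item1": item1, "item3": [5.0, 6.0, 7.0]}

# Hypothetical forecasts of each group's total, as if produced by two
# separately trained group-level models.
group1_total_forecast = 34.0
group2_total_forecast = 18.0

item1_via_group1 = top_down_forecast(group1, group1_total_forecast, "item1")
item1_via_group2 = top_down_forecast(group2, group2_total_forecast, "item1")

# The two values for item1 are close but not equal, because each one
# depends on the other members of its group.
print(round(item1_via_group1, 2), round(item1_via_group2, 2))
```

The item1 history is identical in both calls; only the company it keeps changes, and that is enough to shift its forecast.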
I probably did not understand everything here.
Yes I think you understood my question. This is interesting:
"Monday forecast will be different if you group it with week days or month days."
To use the example provided in the Hyndman/Athanasopoulos text:
"...series can be naturally grouped together based on attributes without necessarily imposing a hierarchical structure. For example the bicycles sold by the warehouse can be for males, females or unisex. They can be used for racing, commuting or recreational purposes. They can be single speed or have multiple gears. Frames can be carbon, aluminium or steel."
The forecast for bicycles sold by the warehouse would be different, depending on which group (male vs. female vs. unisex, racing vs. commuting vs. recreational, single speed vs. multiple gears, carbon vs. aluminum vs. steel) is chosen.
It seems like what we would normally want in this case is not 4 separate forecasts, but rather one single forecast for bicycles sold that reconciles all of the data from the non-hierarchical groups to which the bicycle belongs. But I understand that this may be a limitation of the chosen method, rather than of your implementation, so I will close the issue.
It is probably a semantic issue (the choice of grouping and the grouping order). There are many possible groupings.
This is something that cannot be conveyed through a toy example.
Let's say you have item1, which belongs to two groups: group1 and group2.
The columns of data are then: DateColumn, item1_group1, item1_group2
Training a hierarchical model on this data results in a forecast with columns like this: DateColumn, item1_group1_Forecast, item1_group2_Forecast
But we just want a single forecast column for item1. How do we resolve this?
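One possible workaround, offered only as a sketch, is to post-process the output and average the per-group forecast columns that belong to the same item. This is a naive heuristic, not a PyAF feature and not the reconciliation method described in the Hyndman/Athanasopoulos text. The helper name `combine_item_forecasts` is hypothetical, the forecast values are invented, and the code assumes column names follow the `<item>_<group>_Forecast` pattern with no underscore inside the item name.

```python
from collections import defaultdict
from statistics import mean

def combine_item_forecasts(row, suffix="_Forecast"):
    """Average all '<item>_<group>..._Forecast' values in a row, per item."""
    per_item = defaultdict(list)
    for column, value in row.items():
        if not column.endswith(suffix):
            continue  # skip DateColumn and any non-forecast column
        item = column.split("_", 1)[0]  # assumes no underscore in item names
        per_item[item].append(value)
    return {f"{item}{suffix}": mean(values) for item, values in per_item.items()}

# One output row with made-up forecast values.
row = {
    "DateColumn": "2017-01-06",
    "item1_group1_Forecast": 11.7,
    "item1_group2_Forecast": 11.5,
}

combined = combine_item_forecasts(row)
print(combined)
```

A plain average treats both groupings as equally trustworthy. If one grouping is known to forecast better, a weighted average would be the natural variation.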