Closed mcsqr closed 1 year ago
Check out this pull request on
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Hey. Sorry for taking so long to review, I'll take a look now.
The errors weren't exactly related to python 3.7 but rather scikit-learn >= 1.2, which was released in Dec 2022, not so long ago to stop supporting it. Can you please use something like the following instead?
try:
encoder = OneHotEncoder(categories=categories, sparse_output=sparse_s, dtype=np.float32)
except TypeError: # sklearn < 1.2
encoder = OneHotEncoder(categories=categories, sparse=sparse_s, dtype=np.float32)
Hi Jose, yeah, thanks for the tip. I did it.
Note: the currently biggest point of discussion is the last bullet point in the PR description. It seems that the change you (Fede) have introduced in PR #190 is very unintuitive to users. I'm not sure what's the right way to handle this, but you have some issues open and I also had complaints from my collaborators. The warning I added is just a small plaster on a big wound. But this could also be taken out of this PR if desired.
@jmoralez I think all done.
Some minor but somewhat helpful changes:
S.values
which creates the dense matrix from S whensparse_s = True
in theaggregate
function. This avoids OOM errors for very large datasets (I've used it [theaggregate
function] successfully on a 400k time-series dataset after the change).sparse
argument. Backwards compatibility with old sklearn libraries is maintained.Excluded from this PR:
aggregate
function, as current behaviour is quite unintuitive and leads to users' confusion. [EDIT: Aaah, ok, it seems that this behaviour was introduced as a result of this PR here: Nixtla/hierarchicalforecast/pull/190, see also this issue: Nixtla/hierarchicalforecast/issues/211 . What was the reason again to move this lineY_bottom_df = Y_bottom_df.groupby(['unique_id', 'ds'])['y'].sum().reset_index()
two lines up?] [EDIT 2: Perhaps there should be an additional argument with an option whether to prepend the shorter time series with zeros, or not?]