heidelbergcement / hcrystalball

A library that unifies the API for most commonly used libraries and modeling techniques for time-series forecasting in the Python ecosystem.
https://hcrystalball.readthedocs.io/
MIT License
152 stars 19 forks source link

Extend holiday and seasonality transformers by new features #40

Closed pavelkrizek closed 4 years ago

pavelkrizek commented 4 years ago

@MichalChromcak please take a look.

Reasoning behind the PR This PR aims to improve the handling of holidays effects around public holidays and during specific days. In particular, it helps to model situations when there are just one/few working days between two public holidays, Christmas and other periods with low economic activity without explicit public holidays, special days like first/last day of the year which is often not a public holiday but has a similar effect. The end/start of month/quarter is frequently connected with business processes (budgeting period) which have natural influences on numerous time series.

Implemented changes: HolidayTransformer extended by 3 new parameters - days_before, days_after, bridge_days - all of them are inactive by default

SeasonalityTransformer was extended by 6 new features - all of them are inactive by default: "_month_start", "_month_end", "_quarter_start", "_quarter_end", "_year_start", "_year_end",

MichalChromcak commented 4 years ago

@pavelkrizek Thanks for the features.

Few remarks that come to my mind and might need example/decision on handles.

Despite its undoubtful usefullness, how does the newly added days_before and days_after functionality of HolidayTransformer play along with extra holidays parameter of the prophet model? There seems to be a functional overlap we should address at least in the documentation.

days_before means how many days before public holiday should be taken into account (i.e. 2 means that 2 days before public holidays will be True), creates 1 column

can you elaborate more "creates 1 column"? Does it mean it creates 1 column similar to holiday{...} for both before and after holidays effect with 1 and 0? Or names of holidays that it ranges around?

codecov-commenter commented 4 years ago

Codecov Report

Merging #40 into master will increase coverage by 1.96%. The diff coverage is 97.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #40      +/-   ##
==========================================
+ Coverage   91.65%   93.61%   +1.96%     
==========================================
  Files          55       55              
  Lines        2456     2585     +129     
==========================================
+ Hits         2251     2420     +169     
+ Misses        205      165      -40     
Impacted Files Coverage Δ
tests/integration/test_models_integration.py 100.00% <ø> (ø)
tests/integration/test_set_params_get_params.py 100.00% <ø> (ø)
tests/conftest.py 86.16% <80.00%> (+0.12%) :arrow_up:
.../hcrystalball/model_selection/_data_preparation.py 96.82% <83.33%> (+0.05%) :arrow_up:
...l/model_selection/_large_scale_cross_validation.py 90.58% <90.00%> (+1.30%) :arrow_up:
src/hcrystalball/utils.py 84.17% <90.47%> (+1.12%) :arrow_up:
tests/integration/test_data_unchanged.py 97.61% <90.90%> (+17.26%) :arrow_up:
...sts/unit/model_selection/test_prefect_executors.py 95.23% <93.33%> (+11.23%) :arrow_up:
src/hcrystalball/exceptions.py 90.90% <100.00%> (ø)
...talball/feature_extraction/_holiday_transformer.py 96.36% <100.00%> (+1.91%) :arrow_up:
... and 18 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 99c22be...4bfa11c. Read the comment docs.

MichalChromcak commented 4 years ago

Thanks, merging it now. Will create the notebook kernels issue