octoenergy / timeserio

Better `keras` models for time series and beyond
MIT License
60 stars 16 forks source link

Fix set_params on entire subpipelines #26

Closed ali-tny closed 4 years ago

ali-tny commented 4 years ago

MultiPipeline uses __getattr__ to expose subpipelines stored in the dict MutliPipeline.pipelines as attributes, so sklearn can access them when using .get_params and .set_params. However, __getattr__ is only called if the attribute isn't found in the objects .__dict__ - so if a whole pipeline is overwritten with pipeline.set_params, it silently creates a new attribute that is accessed instead of the original subpipeline with using getattr, but the original pipeline remains unchanged when accessed via pipeline.pipelines (which is how it's called in .transform or .fit).

In other words, using .set_params for an entire subpipeline causes .get_params to show different parameters to what's being called in the code.

This change adds a failing test & a fix: implementing __setattr__ so that the attributes and pipelines are always in sync regardless of how they are updated.

codecov-commenter commented 4 years ago

Codecov Report

Merging #26 into master will increase coverage by 0.02%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #26      +/-   ##
==========================================
+ Coverage   88.68%   88.70%   +0.02%     
==========================================
  Files          34       34              
  Lines        1723     1727       +4     
==========================================
+ Hits         1528     1532       +4     
  Misses        195      195              
Impacted Files Coverage Δ
timeserio/pipeline/multipipeline.py 90.32% <100.00%> (+1.43%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update b23d49b...32d6baa. Read the comment docs.