Created a variety of scenarios for each optimization to test and benchmark its behavior in different situations.
Added mlwhatif support for sklearn PCA, RobustScaler, DummyClassifier, and a fuzzy_merge operation from the library fuzzy-pandas.
Created a ModelVariants what-if analysis and a wrapper for existing analyses that makes it possible to test specific patched pipeline variants, so that interesting optimization scenarios can be explored fully.
Used these scenarios to create benchmarks and run them on a powerful machine
There is still a lot of code duplication between the different scenarios and optimization tests, and the benchmark plot notebook is very messy at the moment. Both will need to be cleaned up in the future.
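For reviewers, here is a minimal sketch of the kind of pipeline the newly supported sklearn operators (RobustScaler, PCA, DummyClassifier) cover. It uses plain scikit-learn only. The synthetic dataset, step names, and parameter values are assumptions chosen for illustration, and it does not show how mlwhatif is invoked or the fuzzy_merge operation.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic data (assumption: stands in for a real benchmark dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pipeline built from the three sklearn operators named in this PR.
pipeline = Pipeline([
    ("scale", RobustScaler()),                                # median/IQR-based scaling
    ("reduce", PCA(n_components=3)),                          # project onto 3 components
    ("classify", DummyClassifier(strategy="most_frequent")),  # trivial baseline model
])
pipeline.fit(X_train, y_train)

# Accuracy of the baseline classifier on the held-out split.
accuracy = pipeline.score(X_test, y_test)

# Output of the preprocessing steps only (everything except the classifier).
reduced = pipeline[:-1].transform(X_test)
print(f"accuracy={accuracy:.3f}, reduced shape={reduced.shape}")
```

A DummyClassifier ignores the input features, so a pipeline like this is mostly useful as a cheap baseline when comparing pipeline variants.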
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Issue #, if available: #21