Closed morrissharp closed 2 years ago
feat_sel_sequential.ipynb needs a few more comments:
BasicImputer to fill missing values, and both EncoderOrdinal and EncoderOHE to deal with categorical variables. These transformations are passed in the transform_pipe parameter as a list. When transform() is called on this or another dataset, these three transformations will be performed prior to SeqFeatSelection.
SeqFeatSelection can be performed on datasets without column names. The next few cells demonstrate how to use SeqFeatSelection on datasets without column names, similar to the example above.
The Imputation example could use some more explanatory comments: https://github.com/microsoft/responsible-ai-toolbox-mitigations/blob/main/notebooks/dataprocessing/module_tests/imputation.ipynb
The Rebalance imblearn example has no explanatory comments https://github.com/microsoft/responsible-ai-toolbox-mitigations/blob/main/notebooks/dataprocessing/module_tests/rebalance_imbl.ipynb
Each of the scalers should have a short description of what the scaler actually does, before providing a link to the sklearn docs. E.g. for DataStandardScaler: this scaler transforms the data to have zero mean and unit variance.
This short description could also be used in scaler.ipynb, where the cells using these scalers do not have descriptions.
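To illustrate the kind of description meant here, a minimal sketch of what "zero mean and unit variance" means in practice. It uses only the Python standard library and does not call DataStandardScaler itself; the helper name `standardize` and the sample column are made up for illustration. It assumes the population standard deviation, which is what sklearn's StandardScaler uses.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values so they have zero mean and unit variance.

    Hypothetical helper for illustration; not part of the library.
    """
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (ddof=0)
    if std == 0:
        # A constant column has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Made-up sample column with mean 5.0 and population standard deviation 2.0.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(column)

print(scaled)          # each value is now expressed in standard deviations from the mean
print(fmean(scaled))   # mean of the scaled column
print(pstdev(scaled))  # standard deviation of the scaled column
```

A one-line comment along these lines above each scaler cell in scaler.ipynb would probably be enough.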
case2.ipynb needs more comments explaining what is going on.
case2_stat.ipynb needs comments. Also, it would be helpful to note that some of these cells take a long time to run (10+ min, etc.).
https://sturdy-barnacle-3b9f911d.pages.github.io/databalanceanalysis/databalanceanalysis.html#databalanceanalysis.aggregate_measures.AggregateBalanceMeasure
FeatureBalanceMeasure, DistributionBalanceMeasure, and AggregateBalanceMeasure classes should have docstrings, at the very least to explain the params required for init
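As a rough sketch of the level of detail that would help, here is a docstring on a stand-in class. `ExampleBalanceMeasure`, `sensitive_cols`, and `label_col` are hypothetical names chosen for illustration; they are assumptions and may not match the actual signatures of the three classes above.

```python
from typing import List


class ExampleBalanceMeasure:
    """Compute balance measures over the sensitive columns of a dataset.

    Hypothetical stand-in class; the parameter names below are assumptions.

    Parameters
    ----------
    sensitive_cols : list of str
        Names of the columns holding sensitive attributes to analyze.
    label_col : str
        Name of the column holding the label.

    Raises
    ------
    ValueError
        If ``sensitive_cols`` is empty.
    """

    def __init__(self, sensitive_cols: List[str], label_col: str):
        if not sensitive_cols:
            raise ValueError("sensitive_cols must contain at least one column name")
        # Copy the list so later changes by the caller do not affect this object.
        self.sensitive_cols = list(sensitive_cols)
        self.label_col = label_col


measure = ExampleBalanceMeasure(sensitive_cols=["gender"], label_col="income")
```

With a docstring like this in place, `help(ExampleBalanceMeasure)` and the generated API page would both show the init params.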