Open dmacko232 opened 2 years ago
Take note that the dataset columns (column corresponds to a key in json) are a bit different in a cleaned dataset. For example, the metadata
dictionary column is separated into different columns with names prefixed by metadata_
.
Currently, it should be easy to add evaluator into feature engineering after there were some major changes to the pipeline. See #31
Consider adding evaluator that computes statistics into feature engineering. Also not sure if swapping order of records selection -> feature engineering to feature engineering -> records selection is not better. And moving feature scaling into feature selection might be better also.