HK3-Lab-Team / pytrousse

PyTrousse collects into one toolbox a set of data wrangling procedures tailored for composing reproducible analytics pipelines.
Apache License 2.0

DataFrameWithInfo features columns could be a subset of metadata columns #2

Closed alessiamarcolini closed 4 years ago

alessiamarcolini commented 4 years ago

At the moment, DataFrameWithInfo takes two related parameters: metadata_cols (a tuple naming the dataframe columns to be used as metadata) and metadata_as_features (a bool indicating whether all the metadata columns should also be treated as features). It would be useful to be able to set only a subset of the metadata columns as features, or even to exclude some columns from both the metadata and the feature sets.

To accomplish this, DataFrameWithInfo could accept both metadata_cols and feature_cols as tuples of strings listing the metadata and the feature columns, respectively. I suggest making feature_cols an optional parameter that defaults to None, meaning that all columns except the metadata columns are feature columns.
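
A rough sketch of how the proposed default could be resolved, just to make the semantics concrete. This is not the actual DataFrameWithInfo code: the helper name `resolve_feature_cols`, its signature, and the example column names are all made up for illustration, and it works on a plain list of column names rather than on a dataframe.

```python
from typing import Optional, Sequence, Tuple


def resolve_feature_cols(
    all_columns: Sequence[str],
    metadata_cols: Tuple[str, ...] = (),
    feature_cols: Optional[Tuple[str, ...]] = None,
) -> Tuple[str, ...]:
    """Hypothetical helper: return the feature columns in dataframe order."""
    if feature_cols is None:
        # Proposed default: every column that is not metadata is a feature.
        metadata = set(metadata_cols)
        return tuple(col for col in all_columns if col not in metadata)

    # Explicit selection: reject names that are not in the dataframe.
    unknown = set(feature_cols) - set(all_columns)
    if unknown:
        raise ValueError(f"Unknown feature columns: {sorted(unknown)}")

    # Features may overlap with metadata_cols, so no exclusion is applied here.
    selected = set(feature_cols)
    return tuple(col for col in all_columns if col in selected)


# Illustrative column names only (assumed, not from pytrousse).
columns = ["patient_id", "age", "visit_date", "glucose", "notes"]
metadata = ("patient_id", "age", "visit_date")

# feature_cols omitted: all non-metadata columns become features.
default_features = resolve_feature_cols(columns, metadata)

# "age" is both metadata and a feature; "notes" is in neither set.
explicit_features = resolve_feature_cols(
    columns, metadata, feature_cols=("age", "glucose")
)
```

With this shape, metadata_as_features would no longer be needed: passing the metadata columns inside feature_cols would cover that case, at least under the assumptions above.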