Open ulupo opened 4 years ago
Fixed by (#135).
At the level of documentation, this is still an open issue, as all docstrings need to be slightly tweaked to state `array-like` instead of `ndarray` when the input can be a pandas dataframe. Notice that outputs are still always `ndarray`s, however.
Notice that this applies to the whole library.
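To make the docstring change concrete, here is a minimal sketch in numpydoc style. The function name `fit_transform_sketch` is hypothetical and only serves to carry the docstring; it is not an existing function in the library.

```python
import numpy as np


def fit_transform_sketch(X):
    """Hypothetical transformer method, shown only for its docstring.

    Parameters
    ----------
    X : array-like of shape (n_samples, n_features)
        Input data. May be an ndarray or a pandas dataframe.

    Returns
    -------
    Xt : ndarray of shape (n_samples, n_features)
        Output data. Always an ndarray, whatever the input type.

    """
    # np.asarray accepts lists, ndarrays and pandas dataframes alike
    return np.asarray(X)
```

The input is documented as `array-like` while the return value stays `ndarray`, matching the behaviour described above.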
Furthermore, we might wish to extend the functionality of `Projection` to allow for passing column names instead of positional indices.
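One possible shape for this is sketched below. It assumes a hypothetical helper `select_columns` (not part of the current `Projection` API) that accepts either positional indices or column names, and resolves names only when the input exposes a `columns` attribute, as pandas dataframes do. pandas itself is never imported.

```python
import numpy as np


def select_columns(X, columns):
    """Hypothetical helper: return the requested columns of X as a 2D ndarray.

    `columns` may be an int, a str, or a list of either. Names are only
    valid when X exposes a `columns` attribute (dataframe-like input).
    """
    cols = [columns] if isinstance(columns, (int, str)) else list(columns)
    if hasattr(X, "columns"):
        names = list(X.columns)
        # Translate any column name into its positional index
        positions = [names.index(c) if isinstance(c, str) else c
                     for c in cols]
    else:
        if any(isinstance(c, str) for c in cols):
            raise ValueError("Column names require a dataframe-like input.")
        positions = cols
    # The output is always a plain ndarray
    return np.asarray(X)[:, positions]
```

With a dataframe `df`, a call such as `select_columns(df, ["height", "weight"])` would then behave like the positional version on the underlying array. Dataframes whose column labels are themselves integers are ambiguous under this sketch and would need an explicit convention.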
The original issue was fixed by (#137) for the Mapper module. But the rest of the library still needs to be looked at systematically.
Description
It is worth discussing whether we want to always ensure that our MapperPipelines can take pandas dataframes as inputs directly. There would potentially be many benefits to this, ranging from less preprocessing by the user to added functionality for displaying summary information (colour, histograms, etc.) when the quantity of interest is a specific column, which is more easily accessed by name than by index location.
It is worth pointing out that a first iteration of this should not add pandas to the requirements files for giotto-learn, following the approach taken by scikit-learn, which accepts dataframes without requiring pandas.
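A minimal sketch of how this could work without importing pandas is given below. It relies on duck typing: any input exposing a `columns` attribute is treated as dataframe-like. The helper name `to_ndarray_with_names` is hypothetical and does not exist in the library.

```python
import numpy as np


def to_ndarray_with_names(X):
    """Hypothetical helper: convert X to an ndarray, keeping column names.

    Returns a pair (array, names), where names is None for inputs that
    are not dataframe-like.
    """
    # Duck typing: no pandas import, so pandas stays an optional dependency
    names = list(X.columns) if hasattr(X, "columns") else None
    return np.asarray(X), names
```

The retained names could later be used to pick a column for colouring or summary statistics, while all internal computation keeps operating on the ndarray.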