giotto-ai / giotto-tda

A high-performance topological machine learning toolbox in Python
https://giotto-ai.github.io/gtda-docs

Refactor plotting functions to match scikit-learn API #159

Closed (lewtun closed this issue 4 years ago)

lewtun commented 4 years ago

Description

scikit-learn now includes a plotting API to visualise metrics like the confusion matrix or ROC curves:

The interesting question is whether we can retain the functionality of our Mapper API while refactoring it into a form closer to scikit-learn's. This would require creating a MapperGraphDisplay class and a corresponding plot_mapper_graph() function.
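To make the idea concrete, here is a minimal sketch of what such a pair could look like, loosely following scikit-learn's pattern of a display object whose plot() method returns the object itself. Everything in it is an assumption for illustration: the constructor arguments (node_positions, edges), the node_size parameter and the figure_ attribute are hypothetical, and the figure is kept as a plain dict in plotly's "data"/"layout" shape so the snippet runs without plotly installed.

```python
class MapperGraphDisplay:
    """Hypothetical display object; not an existing giotto-tda or scikit-learn class."""

    def __init__(self, node_positions, edges):
        self.node_positions = node_positions  # node id -> (x, y) layout coordinates
        self.edges = edges                    # (source id, target id) pairs
        self.figure_ = None                   # filled in by plot()

    def plot(self, node_size=10):
        # All edges go into one line trace; a None entry breaks the line between edges.
        edge_x, edge_y = [], []
        for source, target in self.edges:
            x0, y0 = self.node_positions[source]
            x1, y1 = self.node_positions[target]
            edge_x += [x0, x1, None]
            edge_y += [y0, y1, None]
        node_x = [x for x, _ in self.node_positions.values()]
        node_y = [y for _, y in self.node_positions.values()]
        # Plain dict laid out like a plotly figure: a list of traces plus a layout.
        self.figure_ = {
            "data": [
                {"type": "scatter", "mode": "lines", "x": edge_x, "y": edge_y},
                {"type": "scatter", "mode": "markers", "x": node_x, "y": node_y,
                 "marker": {"size": node_size}},
            ],
            "layout": {"showlegend": False},
        }
        return self  # scikit-learn's Display.plot() also returns the display object


def plot_mapper_graph(node_positions, edges, **plot_kwargs):
    """Hypothetical convenience wrapper: build the display, then plot it."""
    return MapperGraphDisplay(node_positions, edges).plot(**plot_kwargs)


display = plot_mapper_graph(
    {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (2.0, 0.0)},
    [(0, 1), (1, 2)],
)
print(len(display.figure_["data"]))  # one edge trace and one node trace
```

With plotly installed, passing display.figure_ to plotly.graph_objects.Figure should be enough to render it. The real Mapper functions take a pipeline and data rather than precomputed positions, so the wrapper's signature here is only a placeholder.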

A minimal change in preparation for the next release would be to refactor the existing functions as follows:

create_static_network(...) --> plot_static_mapper_graph(...)
create_interactive_network(...) --> plot_interactive_mapper_graph(...)

Some open issues to my mind are:

See also #138

gtauzin commented 4 years ago

As plotly has become a mandatory dependency, it would also be interesting to add visualisation capabilities to the modules of the library related to persistent homology. One way would be to create functions such as:

As we discussed, this could be interfaced with the following estimator-specific functions:

The first two could rely on the third one through the creation of a TransformerPlotterMixin.
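A rough sketch of such a mixin follows. The three method names (fit_transform_plot, transform_plot, plot) and the sample parameter are assumptions made for illustration, since the list of functions is not preserved in this thread. The Scaler class is a toy transformer, and its plot method prints instead of calling plotly so the example runs with the standard library alone.

```python
class TransformerPlotterMixin:
    """Hypothetical mixin: both convenience methods delegate to the estimator's plot()."""

    def transform_plot(self, X, sample=0, **plot_params):
        # Transform first, then hand the result to the estimator-specific plot().
        Xt = self.transform(X)
        self.plot(Xt, sample=sample, **plot_params)
        return Xt

    def fit_transform_plot(self, X, y=None, sample=0, **plot_params):
        # Same as transform_plot, but fits the estimator beforehand.
        Xt = self.fit(X, y).transform(X)
        self.plot(Xt, sample=sample, **plot_params)
        return Xt


class Scaler(TransformerPlotterMixin):
    """Toy transformer used only to exercise the mixin."""

    def fit(self, X, y=None):
        self.max_ = max(max(row) for row in X)
        return self

    def transform(self, X):
        return [[value / self.max_ for value in row] for row in X]

    @staticmethod
    def plot(Xt, sample=0):
        # Stand-in for a plotly call; static, so it cannot modify the estimator.
        print(f"sample {sample}: {Xt[sample]}")


scaler = Scaler()
Xt = scaler.fit_transform_plot([[1, 2], [2, 4]], sample=1)
```

Making plot a static method that receives already-transformed data is one way to guarantee it has no side effects on the fitted estimator.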

Additionally, gtda.Pipeline could be updated to provide a plot function that plots the transformed data at each step (most likely for a specific sample).
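As a sketch of that idea, the helper below walks a list of (name, transformer) steps, applies each one in turn, and collects one plot per step that knows how to plot itself. The function plot_pipeline_steps and the toy AddOne and Double transformers are hypothetical, not gtda.Pipeline code, and each plot returns a small dict in place of a plotly figure.

```python
def plot_pipeline_steps(steps, X, sample=0):
    """Hypothetical helper: fit each step in order and plot its output for one sample."""
    figures = {}
    Xt = X
    for name, transformer in steps:
        Xt = transformer.fit(Xt).transform(Xt)
        # Steps without a plot method are simply skipped.
        if hasattr(transformer, "plot"):
            figures[name] = transformer.plot(Xt, sample=sample)
    return Xt, figures


class AddOne:
    """Toy step that can plot its output."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[value + 1 for value in row] for row in X]

    @staticmethod
    def plot(Xt, sample=0):
        # Stand-in for a plotly figure.
        return {"y": Xt[sample]}


class Double:
    """Toy step with no plot method."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[2 * value for value in row] for row in X]


Xt, figures = plot_pipeline_steps(
    [("add_one", AddOne()), ("double", Double())],
    [[1, 2], [3, 4]],
    sample=1,
)
print(sorted(figures))
```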

@rth A while ago, I also had a look at scikit-yellowbrick, which aims to provide a plotting API for scikit-learn. Do you know how mature and relevant it is for us?

rth commented 4 years ago

A while ago, I also had a look at scikit-yellowbrick, which aims to provide a plotting API for scikit-learn. Do you know how mature and relevant it is for us?

I think yellowbrick is quite relevant. For reference, the scikit-learn plotting API can be found at https://scikit-learn.org/dev/developers/plotting.html. I have mixed feelings about mixing plotting (with plotly) and calculation within the same estimator, and we should consider what side effects this could have (e.g. for pickling), but if some estimators are never used without plotting then it makes sense.

gtauzin commented 4 years ago

@rth Thanks!

If no attributes are instantiated in the plot method itself, would it have any side effects? I believe pickling would not be a problem if those methods do not modify the instance of the class in any way.

rth commented 4 years ago

If no attributes are instantiated in the plot method itself, would it have any side effects? I believe pickling would not be a problem if those methods do not modify the instance of the class in any way.

Indeed. If plotly is a mandatory dependency it should probably be fine.
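The point about side effects can be checked with a small experiment. The sketch below compares the pickled instance state before and after calling plot for two hypothetical classes: one whose plot only reads attributes, and one that stores the figure on the instance. The class names, the scale attribute and the helper state_changed_by_plot are all made up for illustration, and the figure is a plain dict rather than a real plotly object.

```python
import pickle


class StatelessPlotter:
    """Hypothetical estimator whose plot() reads attributes but never sets any."""

    def __init__(self, scale=2.0):
        self.scale = scale

    def plot(self, values):
        # Builds and returns a figure-like dict without touching self.
        return {"data": [{"type": "scatter", "y": [self.scale * v for v in values]}]}


class StatefulPlotter(StatelessPlotter):
    """Same estimator, but plot() keeps the figure on the instance."""

    def plot(self, values):
        self.figure_ = super().plot(values)
        return self.figure_


def state_changed_by_plot(estimator, values):
    # Default pickling serialises the instance __dict__, so the dict is pickled
    # directly here; this also keeps the check independent of where the classes live.
    before = pickle.dumps(vars(estimator))
    estimator.plot(values)
    return pickle.dumps(vars(estimator)) != before


print(state_changed_by_plot(StatelessPlotter(), [1, 2]))
print(state_changed_by_plot(StatefulPlotter(), [1, 2]))
```

In the stateful variant the figure becomes part of whatever gets pickled, which is the kind of side effect worth avoiding if the plot method lives on the estimator.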