giotto-ai / giotto-tda

A high-performance topological machine learning toolbox in Python
https://giotto-ai.github.io/gtda-docs
Other
858 stars 175 forks source link

Mapper visualisation refactor: fix bugs, add summary statistics in hovertext, improve opacity, remove matplotlib dependency, add node_scale kwarg, add clone_pipeline kwarg to interactive plots,restructure/rename plotly_kwargs, improve code #406

Closed ulupo closed 4 years ago

ulupo commented 4 years ago

Reference issues/PRs Fixes #399, #403 and #404 (thanks @miltminz for reporting these). It is an alternative to #405 by @MikeG112, developed in parallel to that solution. It also contains an implementation of a part of #382.

Types of changes

Description This is a thorough refactoring of gtda.mapper, It achieves the following:

Checklist

ulupo commented 4 years ago

@MonkeyBreaker, as perhaps you expected the Azure C++ errors we say in pyflagser seem to have finally appeared in the CI for this project too.

MonkeyBreaker commented 4 years ago

Looking at the error, It actually seems to be same same problem as in pyflagser.

We could apply the same fix set(CMAKE_CXX_STANDARD 14) into CMakeLists.txt. Would you want me to do a PR or can we integrate the fix into this PR ?

ulupo commented 4 years ago

Would you want me to do a PR or can we integrate the fix into this PR ?

Thanks for the quick reply. I think it would be better to place the fix in another PR. Thanks!

MonkeyBreaker commented 4 years ago

You want to do it or do I proceed ?

ulupo commented 4 years ago

@MonkeyBreaker, feel free to proceed!

ulupo commented 4 years ago

Thanks @wreise!

Is it possible to format the strings in summary statstics to display only a few digits?

I thought about it, my issue with that is this: imagine all numbers are in some small range, e.g. (0, 0.0001). If you simply used number of decimals and e.g. set it to 3, all numbers would be shown as 0.

ulupo commented 4 years ago

@wreise and @miltminz, I've now implemented a rounding which defaults to 3 significant figures (not decimals, to avoid the kind of issue I mentioned above).

ulupo commented 4 years ago

@lewtun just adding you as a reviewer for old times sake. Please feel free to ignore as @wreise will be able to follow this. If I had to pick one thing you might be in the best position to try I'd say it is the new behaviour of plotly_params (formerly plotly_kwargs).

lewtun commented 4 years ago

regardingmatplotlib's rgb2hex function, have you checked that the node coloring still works in the 3D plots?

from memory, the reason we had to include the rgb2hex function was specifically to handle inconsistent colorings in 3D.

ulupo commented 4 years ago

Thanks @lewtun, I see no strange behaviour in the 3D plots, but I haven't tested extensively. But e.g. I haven't observed situations where everything becomes monochromatic. Do you think you could reconstruct a relevant example?

Edit: I misread "3D" for "interactive". I don't think I have checked the situation in 3D at all actually. Thanks, I'll look now.

lewtun commented 4 years ago

sure, let me checkout this branch and see if i can break it 😈

lewtun commented 4 years ago

Screen Shot 2020-05-21 at 17 45 19

Unfortunately the rgb2hex really seems to be necessary for the 3D plots 😭. Check out the screenshot, where the color of the tooltip (i.e. info box) does not match the corresponding node.

For reference, here's what the expected behaviour is like on master:

Screen Shot 2020-05-21 at 17 49 50

ulupo commented 4 years ago

Hmm, I see. Let me think about whether I can find a workaround. In any case, the argument for having the info box the same colour as the node is a wee bit weaker now that the statistic is displayed in the box.

lewtun commented 4 years ago

Good point! If you can't find a neat workaround, I suggest we use your new changes since the info box colour appears to be constant anyway, so less likely to cause confusion.

ulupo commented 4 years ago

I wasn't able to find a workaround. I get the feeling that the actual interpolation according to the colorscale (which we achieve by hand via get_cmap) is done in javascript (i.e. by plotly.js). I asked for help in https://community.plotly.com/t/hover-background-color-on-scatter-3d/9185/5?u=umberto.lupo.

ulupo commented 4 years ago

@lewtun the latest commits bring a speedy vectorised version of the hoverlabel color calculator back on the table. Feel free to try to break it again!

ulupo commented 4 years ago

@wreise additionally, I introduced a version of #382 in https://github.com/giotto-ai/giotto-tda/pull/406/commits/018392f8822ec70185ac5d57689a07bd1377ca32 and it seems to work. But I can remove it if you think it's not going to work. Otherwise, would you object if #382 continued being about testing this behaviour?