konstantint / matplotlib-venn

Area-weighted venn-diagrams for Python/matplotlib
MIT License

added suggestion for nested Venn diagram using squares #6

Closed: JohannesBuchner closed this 10 years ago

JohannesBuchner commented 10 years ago

Hi Konstantin,

What do you think about adding this diagram? It shows consecutive sub-selection (subsets of subsets) and is also area-proportional. I used rounded squares.

Call it like so:

venn_nested(sets, labels = [r'$\mathbf{%s}$ ($%d$)' % (string.ascii_uppercase[i], len(s)) for i, s in enumerate(sets)])
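For readers unfamiliar with the label expression in the call above, here is a small sketch of what it evaluates to. The `sets` list is made-up example data, not taken from the pull request, and `venn_nested` itself is not called here.

```python
import string

# Hypothetical example data: each set is a subset of the previous one.
sets = [set(range(100)), set(range(40)), set(range(10))]

# Same expression as in the call above: a bold letter (A, B, C, ...)
# followed by the set size, both in matplotlib mathtext.
labels = [
    r'$\mathbf{%s}$ ($%d$)' % (string.ascii_uppercase[i], len(s))
    for i, s in enumerate(sets)
]

for label in labels:
    print(label)
```

Each set therefore gets a letter in order of appearance plus its element count, and `string` has to be imported for the call to work.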

Cheers, Johannes

[Image: probselect]

konstantint commented 10 years ago

Wow, this is nice. However, I suggest we change the name to something different. Both "Nested" and "Venn" seem slightly out of place for describing this diagram. "Nested" does not fit perfectly, because it refers to the nesting of the sets, not of the diagram. "Venn" does not fit because, for most people, "Venn" implies circles and set intersections. Finally, there's a paper about the "Nested venn diagram", which seems to coin this term for a different concept, and I would not like to challenge that.

The main property of this diagram is actually the use of rounded rectangles rather than circles, so it might be called something like a "rectangular venn diagram for nested sets" (what about nested_rectangles, for example?). I also presume that it is in principle possible to generalize it to a proper "rectangular venn" diagram (for the three-set case, at least), where the sets do not have to be nested (see "Lewis Carroll diagrams" in the nested venn paper referenced above, for example). This would be extra cool (~ rectangular_venn3, etc.).

Finally, I'd change the parameterization a bit. Unless it is a general-purpose "rectangular venn" diagram, providing actual sets as input is confusing. I believe providing a list of numbers is conceptually better, provided there is also an assertion check that the number sequence is nonincreasing. Also, things like the box style and the ratio of the sides should be customizable via parameters, I think.
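A minimal sketch of the suggested parameterization: a list of set sizes is checked for being nonincreasing, and each size is turned into a rectangle whose area equals that size. The function name `rectangle_dimensions` and the `aspect` parameter (width divided by height) are hypothetical, chosen only for illustration; this is not the code from the pull request.

```python
import math


def rectangle_dimensions(sizes, aspect=1.0):
    """Return a (width, height) pair per set size, with width * height == size.

    `aspect` is the width/height ratio shared by all rectangles
    (a hypothetical parameter, not part of matplotlib-venn).

    >>> rectangle_dimensions([100, 25, 4])
    [(10.0, 10.0), (5.0, 5.0), (2.0, 2.0)]
    """
    sizes = list(sizes)
    assert aspect > 0, "aspect must be positive"
    assert all(n >= 0 for n in sizes), "sizes must be non-negative"
    # Each set is a subset of the previous one, so sizes cannot grow.
    assert all(a >= b for a, b in zip(sizes, sizes[1:])), \
        "sizes must be nonincreasing"

    dims = []
    for n in sizes:
        # area = width * height = aspect * height**2
        height = math.sqrt(n / aspect)
        dims.append((aspect * height, height))
    return dims
```

Because every rectangle shares the same aspect ratio, a smaller area always yields a rectangle that fits inside the previous one. The docstring also doubles as a minimal doctest.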

JohannesBuchner commented 10 years ago

Hey, thanks for your response. I think I addressed your comments, except that the file name has not been changed yet.

I also implemented a tree-based visualization for classifiers. In these trees, every child is a subset of the parent set, and children are disjoint. What I had before is a special case of that -- a tree where every node has at most 1 child.

[Images: example_nested, example_tree2]
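The tree structure described above (every child is a subset of its parent, and siblings are disjoint) can be sketched as a small validity check. The node representation, a `(members, children)` tuple, and the function name `is_valid_subset_tree` are assumptions made for this illustration and are not taken from the pull request.

```python
def is_valid_subset_tree(node):
    """Check that each child is a subset of its parent and siblings are disjoint.

    A node is a (members, children) tuple, where `members` is a set and
    `children` is a list of nodes (hypothetical representation).
    """
    members, children = node
    seen = set()  # elements already claimed by an earlier sibling
    for child in children:
        child_members = child[0]
        if not child_members <= members:
            return False  # child contains elements the parent lacks
        if seen & child_members:
            return False  # overlaps with a sibling
        seen |= child_members
        if not is_valid_subset_tree(child):
            return False
    return True


# The earlier nested diagram is the special case of a chain:
# every node has at most one child.
chain = (set(range(10)), [(set(range(5)), [(set(range(2)), [])])])

# A branching tree with disjoint children.
tree = (set(range(10)), [({0, 1, 2}, []), ({3, 4}, [({4}, [])])])

# Invalid: the two children share element 2.
overlapping = (set(range(10)), [({0, 1, 2}, []), ({2, 3}, [])])

print(is_valid_subset_tree(chain))
print(is_valid_subset_tree(tree))
print(is_valid_subset_tree(overlapping))
```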

konstantint commented 10 years ago

Sorry, I'm somewhat busy this week; I will get to merging this ASAP. I'd also like to have some tests there, even if they are only minimal doctests.

The tree visualizer seems like overkill, in the sense that it is not really a "venn" diagram any more, and no sane person would go looking for such functionality in a package named matplotlib-venn. Writing docs for this functionality within the context of matplotlib-venn would already look awkward, and this is a bad sign in my opinion.

I think it would make most sense to have a whole separate package for this kind of visualization (e.g. matplotlib-trees). A tiny self-contained package (even if it is just a single .py file) is always a better choice, both for its users and for the maintainer, than clumping a bunch of disparate functions together. Moreover, I would expect a dedicated tree-diagram package to grow huge eventually, as this is certainly a topic that is not yet covered properly for matplotlib, and there are many kinds of awesome diagrams waiting to be implemented. So think about it. If you want, I can help you with setup.py (or any other packaging issues).

JohannesBuchner commented 10 years ago

That's fair. I created the package matplotlib-subsets on PyPI; the source is available here: https://github.com/JohannesBuchner/matplotlib-subsets

I refer to your package in the README, perhaps you can do the same in yours. Thanks!

konstantint commented 10 years ago

Cool. You should probably change the GitHub project description of your new repository. Because you created it by forking, it copied the "Area-weighted venn diagrams" string there.

Of course I'll add a reference.