paulbrodersen / matplotlib_venn_wordcloud

Venn diagrams with word clouds
MIT License

Some words missing in the set intersection #11

Closed: vashek closed this issue 4 months ago

vashek commented 6 months ago

Some words are not rendered in the set intersection in my simple example, depending on their frequency. Here's a Google Colab link: https://colab.research.google.com/drive/1H8qTzKh0sgC4RmGAHQbFZwxIuRfcMheb?usp=sharing

Code to reproduce:

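import matplotlib.pyplot as plt
from matplotlib_venn_wordcloud import venn2_wordcloud

# Word frequencies for the left set
words_left = {
    "abc": 1,
    "def": 5,
    "qwe": 2,
    "lorem": 1,
}
# Word frequencies for the right set
words_right = {
    "qwe": 3,
    "abc": 1,
    "zxc": 5,
    "ipsum": 1,
}

# Combined frequency of each word across both sets
word_to_frequency = {
    word: words_left.get(word, 0) + words_right.get(word, 0)
    for word in words_left.keys() | words_right.keys()
}
print(word_to_frequency)

# "qwe" and "abc" occur in both sets, so they belong in the intersection
venn2_wordcloud(
    [set(words_left), set(words_right)],
    word_to_frequency=word_to_frequency,
)
plt.show()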

Results in: [image]

The middle section should contain "qwe" and "abc", but both are missing.
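
For reference, the expected content of the middle section can be checked without plotting anything. This is only a sketch using plain Python sets on the same dictionaries as above; the names expected_middle and expected_frequencies are made up for illustration and are not part of matplotlib_venn_wordcloud.

```python
# Same dictionaries as in the reproduction code.
words_left = {"abc": 1, "def": 5, "qwe": 2, "lorem": 1}
words_right = {"qwe": 3, "abc": 1, "zxc": 5, "ipsum": 1}

# Words present in both sets belong to the middle (intersection) region.
# (expected_middle is an illustrative name, not a library object.)
expected_middle = set(words_left) & set(words_right)

# Combined frequency of each shared word, matching how word_to_frequency
# is built in the reproduction code (left count + right count).
expected_frequencies = {
    word: words_left[word] + words_right[word]
    for word in sorted(expected_middle)
}

print(expected_frequencies)  # {'abc': 2, 'qwe': 5}
```

So "qwe" ends up with a combined frequency of 5 (the same as "def" and "zxc"), and "abc" with 2.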

Package versions:

Collecting matplotlib-venn-wordcloud
  Downloading matplotlib_venn_wordcloud-0.2.6-py3-none-any.whl (11 kB)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from matplotlib-venn-wordcloud) (1.25.2)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (from matplotlib-venn-wordcloud) (3.7.1)
Requirement already satisfied: matplotlib-venn in /usr/local/lib/python3.10/dist-packages (from matplotlib-venn-wordcloud) (0.11.10)
Requirement already satisfied: wordcloud in /usr/local/lib/python3.10/dist-packages (from matplotlib-venn-wordcloud) (1.9.3)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (4.49.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (1.4.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (23.2)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (3.1.1)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib->matplotlib-venn-wordcloud) (2.8.2)
Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from matplotlib-venn->matplotlib-venn-wordcloud) (1.11.4)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib->matplotlib-venn-wordcloud) (1.16.0)
Installing collected packages: matplotlib-venn-wordcloud
Successfully installed matplotlib-venn-wordcloud-0.2.6

vashek commented 6 months ago

It's because of the min_font_size. Version 0.2.4 works.

paulbrodersen commented 6 months ago

Can reproduce (sort of):

[image: Figure_1]

paulbrodersen commented 6 months ago

Thanks for the great MWE, and the troubleshooting. I will look into a proper fix sometime this or next week.

paulbrodersen commented 5 months ago

Just a heads-up that I haven't forgotten about this issue. This issue and some other work I have done here have motivated me to implement more extensive changes, which unfortunately will take a little bit longer to complete.

paulbrodersen commented 4 months ago

> This issue and some other work I have done have motivated me to implement more extensive changes, which unfortunately will take a little bit longer to complete.

It took even longer, but it is done. I ended up making a new package that replaces matplotlib-venn and matplotlib_venn_wordcloud. Check it out here. It supports Euler and Venn diagrams for an arbitrary number of sets, produces word clouds like this library does (with small improvements), and implements a much better diagram layout engine.