amueller / word_cloud

A little word cloud generator in Python
https://amueller.github.io/word_cloud
MIT License
10.14k stars 2.32k forks

Dark fringe on pale text on pale background #457

Open DouglasLapsley opened 5 years ago

DouglasLapsley commented 5 years ago

There is a dark fringe on text rendered with an alpha channel and saved as PNG. It is most obvious when both the text and the background are light.

I assume this is because the PNG alpha is premultiplied against black. Is it possible to specify a colour value in the background_color property and use a different property to specify whether the alpha is created?

Thank you.

wc = WordCloud(
    max_words=50, 
    mask=mask,
    margin=10, 
    random_state=1, 
    background_color=None, 
    mode='RGBA',
    font_path=font_path,
    scale=2,
    repeat=True
    ).generate_from_frequencies(text)
wc.recolor(color_func=get_color_func)
wc.to_file('wordCloudOutput.png')

Result

Deps:

astroid==1.6.5
backports.functools-lru-cache==1.5
configparser==3.5.0
cycler==0.10.0
enum34==1.1.6
futures==3.2.0
isort==4.3.4
kiwisolver==1.0.1
lazy-object-proxy==1.3.1
matplotlib==2.2.3
mccabe==0.6.1
numpy==1.15.4
Pillow==5.3.0
pkg-resources==0.0.0
pylint==1.9.3
pyparsing==2.3.0
python-dateutil==2.7.5
pytz==2018.7
singledispatch==3.4.0.3
six==1.12.0
subprocess32==3.5.3
wordcloud==1.5.0
wrapt==1.10.11

amueller commented 5 years ago

Please provide self-contained example code, including imports and data, so that other contributors can just run it and reproduce your issue. Ideally your example code should be minimal.

The results link goes to a JPEG with substantial compression artifacts, so I'm not entirely sure what you're referring to.

DouglasLapsley commented 5 years ago

Sure, no problem. Here is the whole code

Ignore the background image in the image I sent: it has compression artifacts, but that's not the issue. If you look at the pale blue text of the "P", you'll see what looks like JPG artifacting but is, I think, actually greying that comes from premultiplying the alpha against black. If I could change the premultiplication colour to a salmon colour, the greying would be reduced, because the premultiplication colour would then be a closer match to the background colour. At least that's my understanding of it. I may be wrong, of course.

See here for more information

Would be great to see a vector output option as this would solve the problem too.

Great generator by the way :) Thanks!

SgtChrome commented 4 years ago

This is still a problem, and it renders the library almost useless for my use case. I tried to come up with a solution, but the only workaround I've found is to use a background color during the initial rendering and replace it with transparency afterwards. This only works if the word cloud is going to be shown on a uniform background whose color I know beforehand.
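
A rough sketch of one way this workaround could look, using plain tuples rather than any particular image library. The helper name `knock_out_background` and the sample colours are made up for illustration, and the exact steps used for the workaround are not shown in this thread. Antialiased edge pixels stay opaque and keep their blend with the known background, which is why the result only looks right on that same uniform colour.

```python
def knock_out_background(pixels, background):
    """Turn opaque RGB pixels into RGBA, making exact background matches transparent.

    Hypothetical helper: every pixel equal to `background` gets alpha 0,
    everything else (including antialiased edge pixels) stays fully opaque.
    """
    return [
        (r, g, b, 0) if (r, g, b) == background else (r, g, b, 255)
        for (r, g, b) in pixels
    ]


# One made-up row of a rendered image: background, edge pixel, text, background.
SALMON = (250, 128, 114)
row = [SALMON, (252, 192, 184), (255, 255, 255), SALMON]
rgba_row = knock_out_background(row, SALMON)
print(rgba_row)
```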

amueller commented 4 years ago

Sorry for the slow reply. Can you give a small example where this is severe? It's also likely an issue with PIL, not wordcloud. @DouglasLapsley RGBA does not use premultiplication. You can use RGBa if you want a premultiplied alpha channel.

Here is what I get with the example above, where I guess there is a bit of a subtle dark border:

image

amueller commented 4 years ago

Using white text, it's a bit more pronounced: image

amueller commented 4 years ago

Here's a minimal example, btw:

from wordcloud import WordCloud

def get_color_func(word, **kwargs):
    return '#ffffff'

wc = WordCloud(
    max_words=50,
    margin=10,
    random_state=1,
    background_color=None,
    mode='RGBA',
    repeat=True
    ).generate('Hello world hello hello world')

wc.recolor(color_func=get_color_func)
wc.to_file('wordCloudOutput.png')

amueller commented 4 years ago

You can easily solve the problem by setting the background to a different transparent color. The fringe goes away if you do:

from wordcloud import WordCloud

def get_color_func(word, **kwargs):
    return '#ffffff'

wc = WordCloud(
    max_words=50,
    margin=10,
    random_state=1,
    background_color="#FFFFFF00",
    mode='RGBA',
    repeat=True
    ).generate('Hello world hello hello world')

wc.recolor(color_func=get_color_func)
wc.to_file('wordCloudOutput.png')
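
Why this helps can be sketched with a little arithmetic in plain Python. This is a simplified model, not Pillow's actual drawing code: it assumes an antialiased edge pixel is produced by interpolating all four channels between the background and the text colour, which seems to match the images in this thread but may differ between Pillow versions. The function names and the pale backdrop are made up for illustration.

```python
def lerp_pixel(bg, fg, coverage):
    """Interpolate R, G, B and A between bg and fg by the glyph coverage (0..1)."""
    return tuple(round(b + (f - b) * coverage) for b, f in zip(bg, fg))


def composite_over(pixel, backdrop):
    """Composite a straight-alpha RGBA pixel over an opaque RGB backdrop."""
    *rgb, a = pixel
    alpha = a / 255
    return tuple(round(c * alpha + d * (1 - alpha)) for c, d in zip(rgb, backdrop))


WHITE_TEXT = (255, 255, 255, 255)
PALE_PAGE = (250, 240, 230)  # made-up pale backdrop the PNG is shown on

backgrounds = {
    "transparent black": (0, 0, 0, 0),        # assumed result of background_color=None
    "transparent white": (255, 255, 255, 0),  # "#FFFFFF00"
}

results = {}
for name, bg in backgrounds.items():
    edge = lerp_pixel(bg, WHITE_TEXT, 0.5)  # a half-covered edge pixel
    results[name] = composite_over(edge, PALE_PAGE)
    print(name, "edge:", edge, "on page:", results[name])
```

Under this model the edge pixel on transparent black is a half-transparent grey, which composites darker than both the text and the page. On transparent white it is a half-transparent white, which lands between the two.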

amueller commented 4 years ago

You can also use premultiplied alpha with mode="RGBa", which would get rid of the problem, but that mode cannot be stored in PNG.
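
To make the premultiplied option concrete, here is a small sketch of the arithmetic in plain Python, not of any Pillow or wordcloud API. The function names and the pale backdrop are made up. It takes a half-covered edge pixel of white text whose stored channels are (128, 128, 128, 128), an assumed example value, and composites it twice: once reading the channels as straight alpha and once reading them as premultiplied alpha.

```python
def over_straight(pixel, backdrop):
    """'Over' for straight alpha: out = c * a + d * (1 - a)."""
    *rgb, a = pixel
    alpha = a / 255
    return tuple(round(c * alpha + d * (1 - alpha)) for c, d in zip(rgb, backdrop))


def over_premultiplied(pixel, backdrop):
    """'Over' for premultiplied alpha: out = c + d * (1 - a); c is already scaled."""
    *rgb, a = pixel
    alpha = a / 255
    return tuple(round(c + d * (1 - alpha)) for c, d in zip(rgb, backdrop))


def premultiply(pixel):
    """Convert a straight-alpha pixel to premultiplied alpha."""
    *rgb, a = pixel
    return tuple(round(c * a / 255) for c in rgb) + (a,)


PALE_PAGE = (250, 240, 230)  # made-up pale backdrop
edge = (128, 128, 128, 128)  # assumed stored value of a half-covered white edge

as_straight = over_straight(edge, PALE_PAGE)
as_premultiplied = over_premultiplied(edge, PALE_PAGE)
print("read as straight alpha:     ", as_straight)
print("read as premultiplied alpha:", as_premultiplied)

# Half-transparent white, once premultiplied, has the same stored values as `edge`.
print(premultiply((255, 255, 255, 128)))
```

Read as straight alpha, the grey is scaled by alpha a second time and ends up darker than both the text and the backdrop. Read as premultiplied alpha, the same stored values land between the two.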

Does either of you want to send a PR with additions to the documentation?