has2k1 / plotnine

A Grammar of Graphics for Python
https://plotnine.org
MIT License
4.02k stars 217 forks

Custom number of density levels #172

Closed ghost closed 6 years ago

ghost commented 6 years ago

This gives three contour levels. I'm wondering how I might be able to get more levels.

import plotnine as p9
from plotnine.data import faithful
from plotnine import ggplot, aes
import matplotlib.pyplot as plt

p9.options.figure_size = (15,15)
g = (ggplot(data=faithful)
     + p9.geom_density_2d(aes(x='eruptions', y='waiting', color='..level..'))
)
g.save('geyser.png')

[Image: geyser.png, the plotnine output with three contour levels]

For example, ggplot2's stat_density2d gives this plot by default, but geom_contour can be used with the bins argument to set the number of levels. Plotnine doesn't have geom_contour. There's a related question with a bugfix in ggplot2.

How can I set a custom number of bins, or custom break locations as in ggplot2's stat_contour(breaks=c(120,140,160))?

[Image: ggplot2 density contour plot]

has2k1 commented 6 years ago

You can get more levels by giving a hint with the levels parameter.

ghost commented 6 years ago

Thanks. I can confirm levels=10 works.

import numpy as np
import plotnine as p9
from plotnine.data import faithful
from plotnine import ggplot, aes
import matplotlib.pyplot as plt

p9.options.figure_size = (15,15)

g = (
    ggplot(aes(x='eruptions', y='waiting'), data=faithful)
    + p9.geom_point()
    + p9.stat_density_2d(
        aes(fill='..level..'),
        levels=10,
        geom='polygon',
    )
)

g.save('geyser.png')

[Image: geyser.png]

I noticed those shapes on the side don't appear in the R version.

library(ggplot2)

m <- (
    ggplot(faithful, aes(x = eruptions, y = waiting))
    + geom_point()
    + xlim(0.5, 6)
    + ylim(40, 110)
)
m + stat_density_2d(aes(color = stat(level), fill = stat(level)), geom = "polygon")
ggsave('r-geyser.png')

[Image: r-geyser.png]

Trying to understand/diagnose where those side shapes came from, I tried specifying an array of levels, as described in the docs.

However, the following script

import numpy as np
import plotnine as p9
from plotnine.data import faithful
from plotnine import ggplot, aes
import matplotlib.pyplot as plt

p9.options.figure_size = (15,15)

g = (
    ggplot(aes(x='eruptions', y='waiting'), data=faithful)
    + p9.geom_point()
    + p9.stat_density_2d(
        aes(fill='..level..'),
        levels=np.array([.05, 0.1, 0.15, 0.2]),
        geom='polygon',
    )
)

g.save('geyser.png')

raises

/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py:708: UserWarning: Saving 15 x 15 in image.
  from_inches(height, units), units))
/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py:709: UserWarning: Filename: geyser.png
  warn('Filename: {}'.format(filename))
Traceback (most recent call last):
  File "demo.py", line 23, in <module>
    g.save('geyser.png')
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 730, in save
    raise err
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 727, in save
    _save()
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 713, in _save
    fig = figure[0] = self.draw()
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 190, in draw
    return self._draw(return_ggplot)
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 197, in _draw
    self._build()
  File "/home/user/.envs/practice/src/plotnine/plotnine/ggplot.py", line 305, in _build
    layers.compute_statistic(layout)
  File "/home/user/.envs/practice/src/plotnine/plotnine/layer.py", line 87, in compute_statistic
    l.compute_statistic(layout)
  File "/home/user/.envs/practice/src/plotnine/plotnine/layer.py", line 363, in compute_statistic
    data = self.stat.compute_layer(data, params, layout)
  File "/home/user/.envs/practice/src/plotnine/plotnine/stats/stat.py", line 271, in compute_layer
    return groupby_apply(data, 'PANEL', fn)
  File "/home/user/.envs/practice/src/plotnine/plotnine/utils.py", line 632, in groupby_apply
    lst.append(func(d, *args, **kwargs))
  File "/home/user/.envs/practice/src/plotnine/plotnine/stats/stat.py", line 269, in fn
    return cls.compute_panel(pdata, pscales, **params)
  File "/home/user/.envs/practice/src/plotnine/plotnine/stats/stat.py", line 302, in compute_panel
    new = cls.compute_group(old, scales, **params)
  File "/home/user/.envs/practice/src/plotnine/plotnine/stats/stat_density_2d.py", line 104, in compute_group
    data = contour_lines(X, Y, Z, params['levels'])
  File "/home/user/.envs/practice/src/plotnine/plotnine/stats/stat_density_2d.py", line 159, in contour_lines
    x, y = np.vstack(segments).T
  File "/home/user/.envs/practice/lib/python3.6/site-packages/numpy/core/shape_base.py", line 234, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate

has2k1 commented 6 years ago

> I noticed those shapes on the side don't appear in the R version.

When the contours extend well past the limits of the plot, the drawing algorithm cannot properly join the connected points, hence the artefacts. The solution is to expand the limits, just as the R version does, e.g.

+ lims(x=(0.5, 6), y=(40, 110))

> I tried specifying an array of levels

Try shifting those levels down by one decimal place so that they are feasible, i.e. within the range of the computed density.

levels=[.005, .01, .015, .02],

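Levels outside the bounds of the density trigger a bug: no contour segments are found, so np.vstack is called with nothing to concatenate, as in the traceback above.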
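
One way to avoid out-of-range levels is to check candidates against the range of the density before plotting. The sketch below is only an illustration: the helpers levels_in_range and evenly_spaced_levels are hypothetical (not part of plotnine), and a synthetic bivariate normal stands in for the estimated density, so the numbers will differ from what plotnine computes for faithful. The point it shows is that a density spread over roughly 5 units of eruptions by 70 units of waiting has small peak values, which is why levels such as 0.05 can find no contour at all.

```python
import numpy as np


def levels_in_range(Z, levels):
    """Keep only the levels strictly inside the range of the density grid Z."""
    levels = np.asarray(levels, dtype=float)
    return levels[(levels > Z.min()) & (levels < Z.max())]


def evenly_spaced_levels(Z, n):
    """Return n evenly spaced levels strictly between Z.min() and Z.max()."""
    return np.linspace(Z.min(), Z.max(), n + 2)[1:-1]


# Assumption: a synthetic, uncorrelated bivariate normal stands in for the
# estimated density. It is NOT the density plotnine computes for `faithful`.
x = np.linspace(0.5, 6, 100)
y = np.linspace(40, 110, 100)
X, Y = np.meshgrid(x, y)
sx, sy = 1.0, 13.0  # illustrative standard deviations only
Z = np.exp(-0.5 * (((X - 3.5) / sx) ** 2 + ((Y - 71) / sy) ** 2))
Z /= 2 * np.pi * sx * sy  # normalising constant of the bivariate normal

print(Z.max())                                         # peak of the grid
print(levels_in_range(Z, [0.05, 0.1, 0.15, 0.2]))      # candidates above the peak
print(levels_in_range(Z, [0.005, 0.01, 0.015, 0.02]))  # scaled-down candidates
print(evenly_spaced_levels(Z, 10))                     # levels that always fit
```

With a real plot, Z would have to come from the same density estimate that the stat uses; the filtering idea is the same.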

ghost commented 6 years ago

Confirming it works. Thanks again.

import numpy as np
import plotnine as p9
from plotnine.data import faithful
from plotnine import ggplot, aes
import matplotlib.pyplot as plt

p9.options.figure_size = (15,15)

g = (
    ggplot(aes(x='eruptions', y='waiting'), data=faithful)
    + p9.geom_point()
    + p9.stat_density_2d(
        aes(fill='..level..'),
        levels=np.array([.05, 0.1, 0.15, 0.2])*0.1,
        geom='polygon',
    )
    + p9.lims(x=(0.5, 6), y=(40, 110))
)

g.save('geyser.png')

[Image: geyser.png]