yhat / ggpy

ggplot port for python
http://yhat.github.io/ggpy/
BSD 2-Clause "Simplified" License
3.7k stars, 572 forks

Default limits add excessive padding, esp. difficult with facet(scales='free') #516

Open jdanbrown opened 8 years ago

jdanbrown commented 8 years ago

Python ggplot (left) adds excessive padding, I'm guessing because it computes tick-aligned limits, whereas R ggplot2 (right) computes tighter default limits that aren't tick-aligned:

python (left), with facet_wrap:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5) + \
    facet_wrap(x='layer', scales='free')
R (right), with facet_wrap:
ggplot(df, aes(index, act)) +
  geom_point(alpha=.5) +
  facet_wrap(~layer, scales='free')
python (left), no facets:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5)
R (right), no facets:
ggplot(df, aes(index, act)) +
  geom_point(alpha=.5)
df.csv: https://gist.github.com/jdanbrown/d152eceeee2759f9d8bf9622de4c0f49
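
To make the distinction concrete, here is a minimal sketch of the two strategies being contrasted. It is not the actual code of either library: whether Python ggplot really rounds limits out to tick positions is only a guess, and the function names, the tick `step`, and the `mult` parameter are made up for illustration. The 5% figure mirrors what is, to my understanding, ggplot2's default expansion for continuous scales.

```python
import math

def tick_aligned_limits(lo, hi, step):
    """Round the data range outward to the nearest multiples of `step`."""
    return (math.floor(lo / step) * step, math.ceil(hi / step) * step)

def expanded_limits(lo, hi, mult=0.05):
    """Pad the data range by a fraction `mult` of its width on each side."""
    pad = mult * (hi - lo)
    return (lo - pad, hi + pad)

# Hypothetical data range and tick step, chosen only for illustration.
lo, hi, step = 1.2, 8.7, 5.0
print(tick_aligned_limits(lo, hi, step))  # (0.0, 10.0)
print(expanded_limits(lo, hi))            # roughly 0.375 of padding per side
```

With tick alignment the padding depends on where the data happens to fall relative to the ticks, so it can be much larger than a fixed fraction of the range.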

It's straightforward to write a layer that computes "tight" limits, but the code I came up with doesn't compose well with facet(scales='free'): it computes the limits from the data pooled across all facets, and even then the computed limits appear to apply only to the last facet (#517):

from copy import deepcopy

class gg_xtight(object):
    # Sets x limits to the data range plus a margin (a fraction of the range).
    def __init__(self, margin=0.05):
        self.margin = margin
    def __radd__(self, gg):
        gg         = deepcopy(gg)  # don't mutate the plot being added to
        xs         = gg.data[gg._aes['x']]  # x column, pooled across all facets
        lims       = [xs.min(), xs.max()]
        margin_abs = float(self.margin) * (lims[1] - lims[0])
        gg.xlimits = [lims[0] - margin_abs, lims[1] + margin_abs]
        return gg

class gg_ytight(object):
    # Same as gg_xtight, but for the y axis.
    def __init__(self, margin=0.05):
        self.margin = margin
    def __radd__(self, gg):
        gg         = deepcopy(gg)
        ys         = gg.data[gg._aes['y']]
        lims       = [ys.min(), ys.max()]
        margin_abs = float(self.margin) * (lims[1] - lims[0])
        gg.ylimits = [lims[0] - margin_abs, lims[1] + margin_abs]
        return gg

class gg_tight(object):
    # Applies gg_xtight and gg_ytight with the same margin.
    def __init__(self, margin=0.05):
        self.margin = margin
    def __radd__(self, gg):
        return gg + gg_xtight(self.margin) + gg_ytight(self.margin)
python, no facets:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5)
python + gg_tight, no facets:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5) + \
    gg_tight()
python, with facet_wrap:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5) + \
    facet_wrap(x='layer', scales='free')
python + gg_tight, with facet_wrap:
ggplot(df, aes(x='index', y='act')) + \
    geom_point(alpha=.5) + \
    facet_wrap(x='layer', scales='free') + \
    gg_tight()
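
For reference, the per-facet computation that scales='free' would need can be sketched in plain Python, independent of ggplot. The helper names and the sample rows below are hypothetical, and the sketch deliberately does not touch the faceting code: it only shows how limits computed from pooled data differ from limits computed per facet.

```python
from collections import defaultdict

def tight_limits(values, margin=0.05):
    """Return (lo, hi) covering `values`, padded by `margin` of the range."""
    lo, hi = min(values), max(values)
    pad = float(margin) * (hi - lo)
    return (lo - pad, hi + pad)

def tight_limits_by_facet(rows, facet, column, margin=0.05):
    """Group `rows` (a list of dicts) by `facet` and compute limits per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[facet]].append(row[column])
    return {key: tight_limits(vals, margin) for key, vals in groups.items()}

# Made-up rows that reuse the issue's column names, for illustration only.
rows = [
    {'layer': 'a', 'act': 0.0},
    {'layer': 'a', 'act': 1.0},
    {'layer': 'b', 'act': 100.0},
    {'layer': 'b', 'act': 300.0},
]
print(tight_limits([r['act'] for r in rows]))        # pooled across facets
print(tight_limits_by_facet(rows, 'layer', 'act'))   # one pair per facet
```

The pooled limits span both layers, so applying them to every panel would leave facet 'a' almost empty. How to hand the per-facet pairs to each panel is the open part of the question.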

Is there a good approach to improving this? Figured I'd ask for pointers before diving too deep into the faceting code.

jdanbrown commented 8 years ago

Here's an illustration using the sample mpg data, but it's less compelling because the data happens to be less aligned on axis ticks:

python (left), with facet_wrap:
ggplot(mpg, aes(x='displ', y='hwy')) + \
    geom_point() + \
    facet_wrap(x='class', scales='free')
R (right), with facet_wrap:
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, scales='free')
python (left), no facets:
ggplot(mpg, aes(x='displ', y='hwy')) + \
    geom_point()
R (right), no facets:
ggplot(mpg, aes(displ, hwy)) +
  geom_point()