sagemath / sage


Update matplotlib so that plot_directive is less broken #17618

Closed edd8e884-f507-429a-b577-5d554626c0fe closed 9 years ago

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

In #17498, some problems appeared with matplotlib's plot_directive module; in particular, it is impossible to remove the (Source code) link (which points nowhere) above images.

With 1.4.2, the plot_include_source option still seems broken, but a new plot_html_show_source_link option seems to work, so let us upgrade.

Tarball: http://www.lmona.de/files/sage/matplotlib-1.4.3.tar.bz2

CC: @kiwifb @strogdon @kcrisman @gagern @novoselt

Component: packages: standard

Author: Thierry Monteil, François Bissey

Branch/Commit: 9dc580d

Reviewer: Steven Trogdon

Issue created by migration from https://trac.sagemath.org/ticket/17618

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Branch: u/tmonteil/update_matplotlib_so_that_plot_directive_is_less_broken

kiwifb commented 9 years ago

Commit: 374eca4

kiwifb commented 9 years ago
comment:3

I am expecting broken doctests with matplotlib 1.4.x? Did you find any?


New commits:

2fdfecf #17618 package version
ef72f4b #17618 remove upstream-applied patches
188c737 #17618 update setup.py.patch
374eca4 #17618 update SPKG.txt
6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago
comment:4

Is the tarball ready ?

Nathann

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:5

Replying to @kiwifb:

I am expecting broken doctests with matplotlib 1.4.x? Did you find any?

I am indeed getting some MatplotlibDeprecationWarning

Replying to @nathanncohen:

Is the tarball ready ?

As explained on this thread, I would like to be sure to make a reproducible tarball. If you want to test for #17498, you can use the upstream tarball in the meantime (we only remove some directories from it to save space, but there is no difference otherwise).

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago
comment:6

As explained on this thread, I would like to be sure to make a reproducible tarball.

It seems to be the thread on which I mentioned a workaround using "diff -R". I saw your answer but I did not understand it, so I am waiting for the answer to the private email I sent you after that.

Nathann

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:7

Here are the 13 doctest failures that appeared in a make ptestlong, quite a few are redundant:

File "src/doc/en/bordeaux_2008/introduction.rst", line 75, in doc.en.bordeaux_2008.introduction
Failed example:
    list_plot3d(v, interpolation_type='nn')
Expected:
    Graphics3d Object
Got:
    doctest:137: MatplotlibDeprecationWarning: The matplotlib.delaunay module was deprecated in version 1.4. Use matplotlib.tri.Triangulation instead.
    Graphics3d Object
File "src/sage/graphs/generic_graph.py", line 15636, in sage.graphs.generic_graph.GenericGraph.plot
Failed example:
    D.plot(edge_labels=True, color_by_label={'a':'blue', 'b':'red'}, edge_style='dashed')
Expected:
    Graphics object consisting of 34 graphics primitives
Got:
    doctest:239: FormatterWarning: Exception in text/plain formatter: 'module' object has no attribute '_Base'
    None
File "src/sage/plot/arrow.py", line 349, in sage.plot.arrow.Arrow._render_on_subplot
Failed example:
    a.save(filename=filename)
Exception raised:
    Traceback (most recent call last):
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 488, in _run
        self.compile_and_execute(example, compiler, test.globs)
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 850, in compile_and_execute
        exec(compiled, globs)
      File "<doctest sage.plot.arrow.Arrow._render_on_subplot[11]>", line 1, in <module>
        a.save(filename=filename)
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/misc/decorators.py", line 471, in wrapper
        return func(*args, **kwds)
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/plot/graphics.py", line 3048, in save
        figure = self.matplotlib(**options)
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/plot/graphics.py", line 2494, in matplotlib
        g._render_on_subplot(subplot)
      File "/opt/sagemath/sage-6.3/local/lib/python2.7/site-packages/sage/plot/arrow.py", line 414, in _render_on_subplot
        class ConditionalStroke(pe._Base):
    AttributeError: 'module' object has no attribute '_Base'
File "src/sage/plot/arrow.py", line 523, in sage.plot.arrow.arrow2d
Failed example:
    arrow2d((1, 1), (3, 3), linestyle='dashed')
Expected:
    Graphics object consisting of 1 graphics primitive
Got:
    doctest:239: FormatterWarning: Exception in text/plain formatter: 'module' object has no attribute '_Base'
    None
File "src/sage/plot/arrow.py", line 525, in sage.plot.arrow.arrow2d
Failed example:
    arrow2d((1, 1), (3, 3), linestyle='--')
Expected:
    Graphics object consisting of 1 graphics primitive
Got:
    None
File "src/sage/plot/colors.py", line 22, in sage.plot.colors
Failed example:
    sorted(colormaps)
Expected:
    ['Accent', 'Accent_r', 'Blues', 'Blues_r', 'BrBG', 'BrBG_r', ...]
Got:
    [u'Accent',
     u'Accent_r',
     u'Blues',
     u'Blues_r',
     u'BrBG',
File "src/sage/plot/colors.py", line 1337, in sage.plot.colors.get_cmap
Failed example:
    sorted(colormaps)
Expected:
    ['Accent', 'Accent_r', 'Blues', 'Blues_r', ...]
Got:
    [u'Accent',
     u'Accent_r',
     u'Blues',
     u'Blues_r',
     u'BrBG',
     u'BrBG_r',
File "src/sage/plot/colors.py", line 1398, in sage.plot.colors.Colormaps
Failed example:
    sorted(colormaps)
Expected:
    ['Accent', 'Accent_r', 'Blues', 'Blues_r', ...]
Got:
    [u'Accent',
     u'Accent_r',
     u'Blues',
     u'Blues_r',
File "src/sage/plot/colors.py", line 1645, in sage.plot.colors.Colormaps.__delitem__
Failed example:
    maps.popitem()
Expected:
    ('Spectral', <matplotlib.colors.LinearSegmentedColormap object at ...>)
Got:
    (u'Spectral', <matplotlib.colors.LinearSegmentedColormap object at 0x4afd4a0c>)
File "src/sage/plot/graphics.py", line 1091, in sage.plot.graphics.Graphics.add_primitive
Failed example:
    G
Expected:
    Graphics object consisting of 2 graphics primitives
Got:
    doctest:239: FormatterWarning: Exception in text/plain formatter: 'module' object has no attribute '_Base'
    None
File "src/sage/plot/graphics.py", line 2096, in sage.plot.graphics.Graphics.?
Failed example:
    p._matplotlib_tick_formatter(subplot, **d)
Expected:
    (<matplotlib.axes.AxesSubplot object at ...>,
    <matplotlib.ticker.MaxNLocator object at ...>,
    <matplotlib.ticker.MaxNLocator object at ...>,
    <matplotlib.ticker.OldScalarFormatter object at ...>,
    <matplotlib.ticker.OldScalarFormatter object at ...>)
Got:
    (<matplotlib.axes._subplots.AxesSubplot object at 0x4b8da48c>,
     <matplotlib.ticker.MaxNLocator object at 0x4bbe3dac>,
     <matplotlib.ticker.MaxNLocator object at 0x4bbe3f2c>,
     <matplotlib.ticker.OldScalarFormatter object at 0x4bbe388c>,
     <matplotlib.ticker.OldScalarFormatter object at 0x4bbe332c>)
File "src/sage/plot/plot3d/list_plot3d.py", line 90, in sage.plot.plot3d.list_plot3d.list_plot3d
Failed example:
    list_plot3d(m, texture='yellow', interpolation_type='nn',frame_aspect_ratio=[1,1,1/3])
Expected:
    Graphics3d Object
Got:
    doctest:137: MatplotlibDeprecationWarning: The matplotlib.delaunay module was deprecated in version 1.4. Use matplotlib.tri.Triangulation instead.
    Graphics3d Object
File "src/sage/stats/distributions/discrete_gaussian_lattice.py", line 133, in sage.stats.distributions.discrete_gaussian_lattice.DiscreteGaussianDistributionLatticeSampler
Failed example:
    list_plot3d(l, point_list=True, interploation='nn')
Expected:
    Graphics3d Object
Got:
    doctest:137: MatplotlibDeprecationWarning: The matplotlib.delaunay module was deprecated in version 1.4. Use matplotlib.tri.Triangulation instead.
    Graphics3d Object
kiwifb commented 9 years ago
comment:8

Some redundancy indeed. Let's see what's to be done:

  1. Add the missing "u" to the doctests (for unicode, if I am not mistaken). That's one of the easiest bits.
  2. Get rid of calls to _Base in sage/plot/*
  3. Migrate from delaunay to tri.Triang
  4. Fix sage/plot/graphics.py - easy.
  5. Why no output in sage/plot/arrow.py?
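
For item 2, here is a minimal sketch of one way to cope with the missing `_Base` attribute. It assumes the newer matplotlib exposes the same base class under the name `AbstractPathEffect`; that name is an assumption, since nothing in this thread names the replacement. The helper `pick_base_class` and the stand-in namespaces are hypothetical and exist only so the snippet runs without matplotlib installed:

```python
from types import SimpleNamespace


def pick_base_class(module, candidates=("AbstractPathEffect", "_Base")):
    """Return the first attribute of `module` whose name is in `candidates`."""
    for name in candidates:
        base = getattr(module, name, None)
        if base is not None:
            return base
    raise AttributeError("none of %r found on %r" % (candidates, module))


# Hypothetical stand-ins for an "old" and a "new" patheffects module.
class OldBase(object):
    pass


class NewBase(object):
    pass


old_pe = SimpleNamespace(_Base=OldBase)
new_pe = SimpleNamespace(AbstractPathEffect=NewBase)  # assumed new name

old_choice = pick_base_class(old_pe)
new_choice = pick_base_class(new_pe)
print(old_choice.__name__, new_choice.__name__)
```

With a real matplotlib one would pass the imported `matplotlib.patheffects` module instead of the stand-ins. Whether such a fallback is worth keeping depends on whether both versions need to be supported at once.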
kiwifb commented 9 years ago
comment:9

It looks like we may be able to get away with replacing delaunay with tri.Triangulation without any other changes. Another option is to switch to scipy's delaunay.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

bf4b81b #17618 fix bugs involving a missing _Base class in sage/plot/arrow.py
12baa7f #17618 fix unicode doctest failures in colors.py
4c6f6e1 #17618 fix _subplots doctest failure in plot/graphics.py
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 9 years ago

Changed commit from 374eca4 to 4c6f6e1

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Author: Thierry Monteil

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:12

Replying to @kiwifb:

It looks like we may be able to get away with replacing delaunay with tri.Triangulation without any other changes. Another option is to switch to scipy's delaunay.

A drop-in replacement does not work since they do not offer the same methods. Also, we rely on nearest neighbor interpolation which, according to this page, seems to require installing some natgrid package; see also this page about removing natgrid.

kcrisman commented 9 years ago
comment:13

I'm not invested in this, but just be sure to check actual pictures for a good variety of plots...

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:14

Replying to @kcrisman:

I'm not invested in this,

Uh, sorry for cc-ing you, git blame denounced you.

but just be sure to check actual pictures for a good variety of plots...

Yes, we will look at how failed doctests are visually rendered before and after the fixes.

kiwifb commented 9 years ago
comment:15

I, on the other hand, am very interested in this because updating matplotlib is a pre-requisite for updating numpy to 1.9.x (MPL 1.3.x won't build with numpy 1.9.x).

So if we cannot use matplotlib tri.Triangulation could we try scipy's delaunay instead?

kiwifb commented 9 years ago
comment:16

OK, scipy doesn't offer nearest neighbour either. I think it is time to ask on the list whether there is any reason we should stick to nn. Linear or cubic should make a good default; I would go with cubic myself.

kiwifb commented 9 years ago
comment:17

If no one comments on my post on sage-devel by tomorrow, we just go ahead and change the default. Is this OK with you? We may need to check the documentation.

kiwifb commented 9 years ago
comment:18

OK in the absence of comments I think we should go ahead with something.

kiwifb commented 9 years ago
comment:19

I see you didn't put up a tarball or update the checksums, so I will do it. I am testing my current changes with a build from scratch right now.

kiwifb commented 9 years ago
comment:20

After changing to triangulation there are differences from what we had before, but I don't know at this stage whether it is better or worse. See commit.


New commits:

8ee674d Add checksum and migrate to tri.Trinagulation
kiwifb commented 9 years ago

Changed branch from u/tmonteil/update_matplotlib_so_that_plot_directive_is_less_broken to u/fbissey/MPL-1.4

kiwifb commented 9 years ago

Description changed:

--- 
+++ 
@@ -2,3 +2,5 @@

 With 1.4.2, the `plot_include_source` option seems still broken but a new `plot_html_show_source_link` seems to work, so let us upgrade.

+Tarball:
+[https://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.4.2/matplotlib-1.4.2.tar.gz](https://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.4.2/matplotlib-1.4.2.tar.gz)
kiwifb commented 9 years ago

Changed commit from 4c6f6e1 to 8ee674d

kiwifb commented 9 years ago
comment:21

There is also a section of code that was there specifically to correct data that could segfault the delaunay code. Unfortunately there is no doctest, so we cannot check whether we still need it.

kiwifb commented 9 years ago
comment:22

OK, for the first doctest both delaunay and triangulation are bad compared to spline... Difficult to know which one is worse, but nearest neighbour wasn't involved in either.

strogdon commented 9 years ago
comment:23

I'm coming late to the game. Aside from the actual triangulation it looks like there is still

interpolation_type='nn'

in list_plot3d.py which gives failures for me. And in discrete_gaussian_lattice.py there is

interploation='nn'

I guess interploation should be interpolation and the 'nn' should be something else.
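
One plausible reason a misspelled keyword like this can go unnoticed is that a function collecting extra options in `**kwds` never looks at keys it does not know. The sketch below illustrates that under this assumption; `plot_sketch` and `strict_plot_sketch` are hypothetical stand-ins, not the actual Sage functions:

```python
def plot_sketch(points, **kwds):
    # Hypothetical stand-in: options are read out of **kwds, and any
    # unknown key (such as a misspelled one) is silently ignored.
    return kwds.get("interpolation_type", "default")


def strict_plot_sketch(points, **kwds):
    # Same idea, but unknown keywords are rejected up front.
    allowed = {"interpolation_type", "texture"}
    unknown = sorted(set(kwds) - allowed)
    if unknown:
        raise TypeError("unexpected keyword(s): %s" % ", ".join(unknown))
    return kwds.get("interpolation_type", "default")


typo_result = plot_sketch([], interploation="nn")     # typo is swallowed
ok_result = plot_sketch([], interpolation_type="nn")  # option is honoured
print(typo_result, ok_result)
```

In the permissive version the typo falls back to the default without any error, which would explain why such a doctest kept passing while exercising a different code path than intended.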

strogdon commented 9 years ago
comment:24

Actually with interpolation_type='nn' no 3D-graphic is generated at all. And changing 'nn' to 'cubic' I get:

sage -t src/sage/plot/plot3d/list_plot3d.py
**********************************************************************
File "src/sage/plot/plot3d/list_plot3d.py", line 138, in sage.plot.plot3d.list_plot3d.list_plot3d
Failed example:
    list_plot3d(l,interpolation_type='cubic',texture='yellow',num_points=100)
Expected:
    Graphics3d Object
Got:
    doctest:3847: UserWarning: Warning: converting a masked element to nan.
    Graphics3d Object
**********************************************************************
kiwifb commented 9 years ago
comment:25

Yes, still work in progress, I'll have to get to all these next. We'll have to explore that last warning too.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:26

Hi, sorry, I was offline these past days, which is why I did not answer earlier. @kiwifb I am not sure 2 days is a sufficient delay to let people answer in a community such as Sage, which involves so many people, none of whom work full time on it.

I am not yet convinced that cubic interpolation is a suitable replacement for nearest neighbor, in terms of speed, meaning, and visual rendering, so I guess this should be inspected further in that respect (as suggested by @kcrisman in his previous answer). I will try to work on it later in the day.

I see two things:

kiwifb commented 9 years ago
comment:27

Things happen! Anyway, if you want an answer to something that was posted more than 24-48 hours ago, you may need to post it again. Admittedly, the list has been relatively quiet this past week, so it may not have been drowned in the flow as it normally would be.

Relying on the fact that it is only deprecated is only pushing the problem away, in my opinion; it is a form of procrastination.

I guess we don't completely lose delaunay if we want to hold on to it: we can migrate to scipy. What we will lose quickly is the nearest neighbour default. scipy doesn't have it, MPL doesn't really want it, so why should we have it? Are there any technical arguments for it? Also, the code underlying delaunay is changing in any case to be based on qhull, so there may be differences, even in the linear case, if we decide to postpone.

jpflori commented 9 years ago
comment:28

Slightly related, I've opened #17642, #17643, #17644 for numpy/scipy/sympy updates (currently, nothing to see there though).

strogdon commented 9 years ago
comment:29

I may be missing something, and it may be a terminology thing, but it appears that scipy-0.15 does have nearest neighbor interpolation:

https://github.com/scipy/scipy/blob/master/scipy/interpolate/interpolate.py and https://github.com/scipy/scipy/blob/master/scipy/interpolate/interpolate_wrapper.py

search for nearest.

kiwifb commented 9 years ago
comment:30

Indeed, I thought I had read that scipy didn't have it. Maybe delaunay doesn't have it.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Attachment: vectors.sobj.gz

Attachment: 6.4_linear.png

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Attachment: 6.4_nn.png

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Attachment: 6.5_linear.png

Attachment: 6.5_cubic.png

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:31

I did the following test to compare the new cubic interpolation type with the former nn:

Creation of a common test (taken from discrete_gaussian_lattice.py doctest):

from sage.stats.distributions.discrete_gaussian_lattice import DiscreteGaussianDistributionLatticeSampler
D = DiscreteGaussianDistributionLatticeSampler(identity_matrix(2), 3.0)
S = [D() for _ in range(2^12)]
l = [vector(v.list() + [S.count(v)]) for v in set(S)]
save(l, '/tmp/vectors.sobj')

Then, on the former 6.4:

sage: l = load('/tmp/vectors.sobj')
sage: %timeit P = list_plot3d(l, interpolation_type='linear', texture="automatic", point_list=True)
100 loops, best of 3: 12.4 ms per loop

sage: %timeit P = list_plot3d(l, interpolation_type='nn', texture="automatic", point_list=True)
10 loops, best of 3: 29.8 ms per loop

On the patched 6.5.beta5:

sage: l = load('/tmp/vectors.sobj')
sage: %timeit P = list_plot3d(l, interpolation_type='linear', texture="automatic", point_list=True)
100 loops, best of 3: 18.8 ms per loop

sage: %timeit P = list_plot3d(l, interpolation_type='cubic', texture="automatic", point_list=True)
1 loops, best of 3: 498 ms per loop

The generation of the cubic interpolation is much (about 16x) slower than the nn interpolation, and the visual renderings are very different (nn looks more like a smoothed linear, while cubic shows more spots at lattice points). So I agree to add the cubic interpolation type, but I disagree with letting it replace nn in existing tests (too different) or become the default (too slow).
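
The slowdown figure can be checked directly from the timings quoted above. This is plain arithmetic on those four numbers; the dictionary keys are just labels chosen here:

```python
# Timings quoted in the comment above, in milliseconds per loop.
timings_ms = {
    "6.4 linear": 12.4,
    "6.4 nn": 29.8,
    "6.5.beta5 linear": 18.8,
    "6.5.beta5 cubic": 498.0,
}

# How much slower the new cubic is than the former nn.
cubic_vs_nn = timings_ms["6.5.beta5 cubic"] / timings_ms["6.4 nn"]
# How much slower linear became between the two builds.
linear_new_vs_old = timings_ms["6.5.beta5 linear"] / timings_ms["6.4 linear"]

print("cubic / nn: %.1f" % cubic_vs_nn)
print("new linear / old linear: %.2f" % linear_new_vs_old)
```

The first ratio comes out between 16 and 17, and the second shows that linear itself became roughly one and a half times slower on the patched build.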

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:32

In list_plot3d.py, the following is more than dubious, especially if drop_list is not empty:

    x=[x[i] for i in range(len(x)) if i not in drop_list]
    y=[y[i] for i in range(len(x)) if i not in drop_list]
    z=[z[i] for i in range(len(x)) if i not in drop_list]
edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago

Changed branch from u/fbissey/MPL-1.4 to u/tmonteil/MPL-1.4

kiwifb commented 9 years ago

Changed commit from 8ee674d to 065dc77

kiwifb commented 9 years ago
comment:34

Hadn't seen your plots (not shown in email), that's cool work. I don't think the stuff with drop_list is spurious. The point is to eliminate duplicate entries. Indices are added to drop_list if the corresponding point is identical to a previous one. The code you quote builds new lists x, y and z in which each point is unique.

On the other hand, the code that generates drop_list is a bit gauche

if z[i] != z[j]:
.....
elif z[i] == z[j]:
    drop_list.append(j)

As if there were a third case? The elif could be replaced by else, in my opinion.


New commits:

1b4e74e #17618 : fix spacing (trivial)
065dc77 #17618 : fix comment 32
edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:35

Replying to @kiwifb:

I don't think the stuff with drop_list is spurious. The point is to eliminate duplicate entries. Indices are added to drop_list if the corresponding point is identical to a previous one. The code you quote builds new lists x, y and z in which each point is unique.

The problem is that, if drop_list is not empty, then at the first line, the list x is shortened (duplicates are removed), but then only a prefix of the lists y and z is dealt with (since len(x) decreased at the first line).

On the other hand, the code that generates drop_list is a bit gauche

if z[i] != z[j]:
.....
elif z[i] == z[j]:
    drop_list.append(j)

As if there were a third case? The elif could be replaced by else, in my opinion.

Indeed, it would save a useless computation to just write else.

kiwifb commented 9 years ago
comment:36

Replying to @sagetrac-tmonteil:

Replying to @kiwifb:

I don't think the stuff with drop_list is spurious. The point is to eliminate duplicate entries. Indices are added to drop_list if the corresponding point is identical to a previous one. The code you quote builds new lists x, y and z in which each point is unique.

The problem is that, if drop_list is not empty, then at the first line, the list x is shortened (duplicates are removed), but then only a prefix of the lists y and z is dealt with (since len(x) decreased at the first line).

Good spotting, I understand what you mean now. The quick fix is to use len(y) and len(z), or a pre-set auxiliary variable as you have done in some other places.
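
The truncation described above can be reproduced on a tiny input. This is a minimal sketch: `filter_buggy` mirrors the three quoted lines, while `filter_fixed` is one possible form of the "pre-set auxiliary variable" fix and is not necessarily what was committed. Both function names are made up for the illustration:

```python
def filter_buggy(x, y, z, drop_list):
    # Mirrors the quoted snippet: x is rebound first, so len(x) has
    # already shrunk when y and z are filtered.
    x = [x[i] for i in range(len(x)) if i not in drop_list]
    y = [y[i] for i in range(len(x)) if i not in drop_list]
    z = [z[i] for i in range(len(x)) if i not in drop_list]
    return x, y, z


def filter_fixed(x, y, z, drop_list):
    n = len(x)  # length captured before any list is shortened
    dropped = set(drop_list)
    keep = [i for i in range(n) if i not in dropped]
    return [x[i] for i in keep], [y[i] for i in keep], [z[i] for i in keep]


x = [0, 1, 1, 2]
y = [0, 1, 1, 2]
z = [5, 6, 6, 7]
drop_list = [2]  # index 2 duplicates index 1

buggy = filter_buggy(x, y, z, drop_list)
fixed = filter_fixed(x, y, z, drop_list)
print(buggy)  # y and z lose their last entry
print(fixed)  # all three lists keep indices 0, 1 and 3
```

In the buggy version x keeps three entries while y and z keep only two, so the lists no longer describe the same points.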

kiwifb commented 9 years ago
comment:37

OK I spent some time over the scipy documentation and I think I will try to use scipy delaunay and interpolation instead. It looks doable. There may still be some differences in rendering because the back end for delaunay is different. It may take me a couple of days before getting there.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:38

Replying to @kiwifb:

OK I spent some time over the scipy documentation and I think I will try to use scipy delaunay and interpolation instead. It looks doable. There may still be some differences in rendering because the back end for delaunay is different. It may take me a couple of days before getting there.

Great ! I will finish some cleanup of list_plot3d.py meanwhile.

kiwifb commented 9 years ago
comment:39

I tried to post a comment on Tuesday but trac wouldn't let me :( So, in scipy we can produce a Delaunay object and use it for linear interpolation, but the nearest neighbor method cannot use a Delaunay object, just a list of points. The linear interpolation can use the same kind of list of points. So I think I will give the Delaunay object the cold shoulder and go directly to the list of points for both.

edd8e884-f507-429a-b577-5d554626c0fe commented 9 years ago
comment:40

Replying to @kiwifb:

I tried to post a comment on Tuesday but trac wouldn't let me :( So, in scipy we can produce a Delaunay object and use it for linear interpolation, but the nearest neighbor method cannot use a Delaunay object, just a list of points. The linear interpolation can use the same kind of list of points. So I think I will give the Delaunay object the cold shoulder and go directly to the list of points for both.

I definitely agree with that: using Delaunay triangulations eases finding nearest neighbors, but nearest neighbors can be found without that, see also this comment. The only loss from not using an underlying Delaunay triangulation should be in speed (though computing a Delaunay triangulation also has a cost, even with qhull). Do you plan to use scipy.interpolate.NearestNDInterpolator (or scipy.interpolate.griddata) for nn interpolation? If yes, it seems to rely on scipy.spatial.cKDTree, which seems to be a method unrelated to Delaunay triangulation; perhaps it is faster?

Also, it could be nice to keep both matplotlib's and scipy's linear and cubic interpolations in order to compare them (both visually and in terms of speed), at least for now. Depending on that, we could give the whole job to scipy.
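
To illustrate that nearest neighbors can be found without any triangulation, here is a minimal brute-force sketch in plain Python. It is not how scipy does it (a linear scan per query is far slower than a k-d tree on large inputs), and the function name and sample data are made up for the example:

```python
def nearest_value(points, values, query):
    """Return the value attached to the sample point closest to `query`."""
    qx, qy = query
    best_index = min(
        range(len(points)),
        key=lambda i: (points[i][0] - qx) ** 2 + (points[i][1] - qy) ** 2,
    )
    return values[best_index]


# Four scattered samples (x, y) with a height attached to each.
samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
heights = [10.0, 20.0, 30.0, 40.0]

# Evaluate on a 5x5 grid covering the unit square.
grid = [
    [nearest_value(samples, heights, (0.25 * i, 0.25 * j)) for i in range(5)]
    for j in range(5)
]
for row in grid:
    print(row)
```

Every grid value is one of the original heights, since this kind of interpolation only copies the value of the closest sample and never blends neighbours.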

kiwifb commented 9 years ago
comment:41

Delaunay would be especially useful for speed if we were re-using it, but that's not the case. I was planning on using scipy.interpolate.NearestNDInterpolator. I will try to put everything in at first so we can test as you suggest, so there will be MPLlinear and SPlinear. I will investigate using griddata too, now that you have suggested it.

I am caught up in other things, but hopefully I can get a decent shot at it today.

kiwifb commented 9 years ago
comment:42

OK, so scipy and MPL linear interpolation look pretty much the same on screen, but scipy's nn is ugly and all blocky. scipy's linear is the faster of the two linear variants:

sage: %timeit P = list_plot3d(l, interpolation_type='linear', texture="automatic", point_list=True)
100 loops, best of 3: 6.6 ms per loop
sage: %timeit P = list_plot3d(l, interpolation_type='MPLlinear', texture="automatic", point_list=True)
100 loops, best of 3: 7.71 ms per loop
sage: %timeit P = list_plot3d(l, interpolation_type='nn', texture="automatic", point_list=True)
100 loops, best of 3: 4.15 ms per loop

The first one is scipy's linear. nn is fast but blocky
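
A one-dimensional toy comparison gives a sense of why a pure closest-sample interpolation looks blocky next to a linear one. This is only a sketch in plain Python with made-up data and helper names, not the scipy or matplotlib code paths timed above:

```python
def interp_nearest(xs, ys, q):
    # Copy the value of the closest sample (ties go to the earlier one).
    i = min(range(len(xs)), key=lambda k: abs(xs[k] - q))
    return ys[i]


def interp_linear(xs, ys, q):
    # xs must be sorted increasingly and q must lie in [xs[0], xs[-1]].
    for k in range(len(xs) - 1):
        x0, x1 = xs[k], xs[k + 1]
        if x0 <= q <= x1:
            t = (q - x0) / (x1 - x0)
            return ys[k] + t * (ys[k + 1] - ys[k])
    raise ValueError("query outside the sampled range")


xs = [0.0, 1.0, 2.0]
ys = [0.0, 10.0, 0.0]
queries = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0]

nearest_profile = [interp_nearest(xs, ys, q) for q in queries]
linear_profile = [interp_linear(xs, ys, q) for q in queries]
print(nearest_profile)
print(linear_profile)
```

The nearest profile jumps between plateaus at the sample values, while the linear profile ramps up and down; the same effect on a surface shows up as flat patches with sharp edges.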