[Bug]: Making an RGB image from pickled data throws error

scottshambaugh commented 4 days ago

Bug summary

I get an error when saving an animation of an RGB image in a figure that was loaded from a pickle. The error appears in matplotlib 3.9.0, and the same code works in 3.8.3, which makes me think this has to do with the pybind11 upgrade in https://github.com/matplotlib/matplotlib/pull/26275?

Things I've tried:

- Grayscale images (e.g. data = np.random.rand(100, 100)) work.
- NumPy v1.26.4 and v2.0.0 show no difference in behavior.
- This shows up on at least WSL and Ubuntu.
- In the debugger, both data.dtype and out.dtype show 'float64' just before the _image.resample call.
  - However, if I re-cast both arrays with data = data.astype('float64'), out = ..., then the _image.resample call no longer fails!
  - If I re-cast only one, then out.dtype == data.dtype still returns True, but the function call raises ValueError: Input and output arrays have mismatched types
  - ... so something is up with the types, and the C++ code is bombing, even though Python says they line up.

See these parts of the source:

https://github.com/matplotlib/matplotlib/blob/d7d1bba818ef36b2475b5d73cad6394841710211/lib/matplotlib/image.py#L205-L213
https://github.com/matplotlib/matplotlib/blob/d7d1bba818ef36b2475b5d73cad6394841710211/src/_image_wrapper.cpp#L174-L199

Code for reproduction

import io
import pickle
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from matplotlib.animation import FuncAnimation

dir = Path(__file__).parent.resolve()

# generate random rgb data
fig, ax = plt.subplots()
np.random.seed(0)
data = np.random.rand(100, 100, 3)
ax.imshow(data)

# pickle the figure and reload
buf = io.BytesIO()
pickle.dump(fig, buf)
buf.seek(0)
fig_pickled = pickle.load(buf)

# Animate
def update(frame):
    return ax,

ani = FuncAnimation(fig_pickled, update, frames=2)

# Save the animation
filepath = dir / 'test.gif' 
ani.save(filepath)

Actual outcome

Exception has occurred: ValueError
arrays must be of dtype byte, short, float32 or float64
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 208, in _resample
    _image.resample(data, out, transform,
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 567, in _make_image
    output = _resample(  # resample rgb channels
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 952, in make_image
    return self._make_image(self._A, bbox, transformed_bbox, clip,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 653, in draw
    im, l, b, trans = self.make_image(
                      ^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/artist.py", line 72, in draw_wrapper
    return draw(artist, renderer)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/axes/_base.py", line 3110, in draw
    mimage._draw_list_compositing_images(
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/artist.py", line 72, in draw_wrapper
    return draw(artist, renderer)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/image.py", line 132, in _draw_list_compositing_images
    a.draw(renderer)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/figure.py", line 3157, in draw
    mimage._draw_list_compositing_images(
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/artist.py", line 72, in draw_wrapper
    return draw(artist, renderer)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/artist.py", line 95, in draw_wrapper
    result = draw(artist, renderer, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/backends/backend_agg.py", line 387, in draw
    self.figure.draw(self.renderer)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/backends/backend_agg.py", line 432, in print_raw
    FigureCanvasAgg.draw(self)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/backend_bases.py", line 2054, in <lambda>
    print_method = functools.wraps(meth)(lambda *args, **kwargs: meth(
                                                                 ^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/backend_bases.py", line 2204, in print_figure
    result = print_method(
             ^^^^^^^^^^^^^
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/backends/backend_qtagg.py", line 75, in print_figure
    super().print_figure(*args, **kwargs)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/figure.py", line 3390, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/animation.py", line 371, in grab_frame
    self.fig.savefig(self._proc.stdin, format=self.frame_format,
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/lib/matplotlib/animation.py", line 1109, in save
    writer.grab_frame(**savefig_kwargs)
  File "/mnt/c/Users/Scott/Documents/Documents/Coding/matplotlib/_test_pybind11_error.py", line 35, in <module>
    ani.save(filepath)
ValueError: arrays must be of dtype byte, short, float32 or float64

Matplotlib Version

3.9.0

scottshambaugh commented 4 days ago

I am able to work around this issue by manually re-casting the image data prior to the call, so my hunch is that this is an error to do with the pickling.

Updated example with workaround:

import io
import pickle
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from pathlib import Path
from matplotlib.animation import FuncAnimation

dir = Path(__file__).parent.resolve()

# generate random rgb data
fig, ax = plt.subplots()
np.random.seed(0)
data = np.random.rand(100, 100, 3)
ax.imshow(data)

# pickle the figure and reload
buf = io.BytesIO()
pickle.dump(fig, buf)
buf.seek(0)
fig_pickled = pickle.load(buf)

# Workaround
ax = fig_pickled.get_axes()[0]
artists = ax.get_children()
for artist in artists:
    if isinstance(artist, mpl.image.AxesImage):
        array = artist.get_array()
        artist.set_array(array.data.astype('float64'))

# Animate
def update(frame):
    return ax,

ani = FuncAnimation(fig_pickled, update, frames=2)

# Save the animation
filepath = dir / 'test.gif' 
ani.save(filepath)

ianthomas23 commented 4 days ago

I can reproduce this on macOS without animation using:

import io
import numpy as np
import matplotlib.pyplot as plt
import pickle

fig, ax = plt.subplots()

rng = np.random.default_rng(4181)
data = rng.uniform(size=(2, 2, 3))
axes_image = ax.imshow(data)
print(axes_image._A.shape, axes_image._A.dtype)
im = axes_image.make_image(None)[0]

buf = io.BytesIO()
pickle.dump(axes_image, buf)
buf.seek(0)
axes_image2 = pickle.load(buf)
print(axes_image2._A.shape, axes_image2._A.dtype)

#axes_image2._A = axes_image2._A.astype("float64")
print("Same dtype?", axes_image._A.dtype == axes_image2._A.dtype)

im = axes_image2.make_image(None)[0]

Running this raises ValueError: arrays must be of dtype byte, short, float32 or float64. If you uncomment the astype line to force a dtype change, it works fine.

The problem occurs on this line: https://github.com/matplotlib/matplotlib/blob/d7d1bba818ef36b2475b5d73cad6394841710211/src/_image_wrapper.cpp#L189

After pickling and unpickling, the numpy array dtype is fine from a Python point of view. From a C++ pybind11 point of view, the dtype has all the right properties but its PyObject has a different address, so we conclude that it is not really a double (i.e. np.float64) dtype. I haven't got any further than this yet, but if my analysis is correct it should be possible to write a reproducer that doesn't use Matplotlib at all.
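
A minimal sketch of what such a Matplotlib-free check might look like, assuming the analysis above is right. It round-trips a float64 array through pickle and compares the restored dtype with the canonical float64 dtype in two ways: by value (what Python-level checks see) and by object identity (what an address comparison would see). The variable names are illustrative only, and the identity result may depend on the NumPy version, so no output is claimed for it.

```python
import io
import pickle

import numpy as np

# Round-trip a float64 RGB-shaped array through pickle.
original = np.random.default_rng(4181).uniform(size=(2, 2, 3))
buf = io.BytesIO()
pickle.dump(original, buf)
buf.seek(0)
restored = pickle.load(buf)

canonical = np.dtype("float64")

# Value equality: this is what `a.dtype == b.dtype` reports in Python.
print("equal:    ", restored.dtype == canonical)

# Object identity: this is what an address comparison would report.
# The result may vary between NumPy versions, so it is only printed.
print("identical:", restored.dtype is canonical)
print("ids:      ", id(restored.dtype), id(canonical))
```

If the two dtypes compare equal but are not the same object, that would match the behaviour described above, where Python reports matching dtypes and the extension code rejects the array.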

tacaswell commented 3 days ago

Can we fall back to __eq__ in the C++ code instead of is? A version of this is reproducible without pickle:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

rng = np.random.default_rng(4181)
data = rng.uniform(size=(2, 2, 3)).astype(np.dtype('float64', copy=True))
axes_image = ax.imshow(data)
print(axes_image._A.shape, axes_image._A.dtype)
im = axes_image.make_image(None)[0]

ianthomas23 commented 3 days ago

Can we fall back to __eq__ in the C++ code instead of is?

It looks like dtype1.equal(dtype2) is good.
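
To illustrate the difference between the two kinds of check, here is a rough Python analogue. It is only a sketch: SUPPORTED, match_by_identity and match_by_equality are hypothetical names invented for this example, not part of Matplotlib or pybind11, and the list of dtypes is a stand-in rather than the exact set the C++ code accepts.

```python
import numpy as np

# Hypothetical stand-in for the dtypes a resampler might accept.
SUPPORTED = (
    np.dtype("uint8"),
    np.dtype("int16"),
    np.dtype("float32"),
    np.dtype("float64"),
)


def match_by_identity(dtype):
    """Return the supported dtype that is the very same object, else None."""
    for candidate in SUPPORTED:
        if dtype is candidate:
            return candidate
    return None


def match_by_equality(dtype):
    """Return the supported dtype that compares equal, else None."""
    for candidate in SUPPORTED:
        if dtype == candidate:
            return candidate
    return None


# A float64 dtype created as a copy, as in the pickle-free reproducer above.
copied = np.dtype("float64", copy=True)

print("by identity:", match_by_identity(copied))
print("by equality:", match_by_equality(copied))
```

The equality-based lookup accepts any dtype that describes the same type, regardless of which object carries that description, which is the behaviour the fallback is meant to provide.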
