has2k1 / mizani

A scales package for python
https://mizani.readthedocs.io
BSD 3-Clause "New" or "Revised" License

"ValueError: assignment destination is read-only" in .bounds.squish_infinite() with pandas CoW #38

Closed khaeru closed 8 months ago

khaeru commented 8 months ago

We began to see issues today in GitHub Actions runs like this one that boil down to:


  File "/tmp/ipykernel_8039/2786353714.py", line 32, in save_plot
    obj.save("westeros_report.pdf", verbose=False)
    sv = self.save_helper(
         ^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/ggplot.py", line 610, in save_helper
    figure = self.draw(show=False)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/ggplot.py", line 279, in draw
    self._draw_layers()
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/ggplot.py", line 442, in _draw_layers
    self.layers.draw(self.layout, self.coordinates)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/layer.py", line 459, in draw
    l.draw(layout, coord)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/layer.py", line 367, in draw
    self.geom.draw_layer(self.data, layout, coord, **params)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/geoms/geom.py", line 289, in draw_layer
    self.draw_panel(pdata, panel_params, coord, ax, **params)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/geoms/geom_path.py", line 148, in draw_panel
    self.draw_group(gdata, panel_params, coord, ax, **params)
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/geoms/geom_path.py", line 158, in draw_group
    data = coord.transform(data, panel_params, munch=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/coords/coord_cartesian.py", line 61, in transform
    return transform_position(data, squish_infinite_x, squish_infinite_y)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/positions/position.py", line 127, in transform_position
    data[xs] = data[xs].apply(trans_x)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/pandas/core/frame.py", line 10034, in apply
    return op.apply().__finalize__(self, method="apply")
           ^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/pandas/core/apply.py", line 837, in apply
    return self.apply_standard()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/pandas/core/apply.py", line 965, in apply_standard
    results, res_index = self.apply_series_generator()
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/pandas/core/apply.py", line 981, in apply_series_generator
    results[i] = self.func(v, *self.args, **self.kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/plotnine/coords/coord_cartesian.py", line 56, in squish_infinite_x
    return squish_infinite(col, range=panel_params.x.range)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/hostedtoolcache/Python/3.11.8/x64/lib/python3.11/site-packages/mizani/bounds.py", line 242, in squish_infinite
    _x[np.isneginf(_x)] = range[0]
    ~~^^^^^^^^^^^^^^^^^
ValueError: assignment destination is read-only

My colleague @glatterf42 helpfully bisected and narrowed down the cause, which I summarize here:

So I realize this issue is about a migration that mizani may have to make in the future, but at the moment it prevents simultaneous use of plotnine and dask[dataframe].

We will try to hack up a work-around (maybe forcibly disable copy-on-write just before a call to plotnine.ggplot.save()?) but wanted to give a heads-up.

phofl commented 8 months ago

Pandas is moving towards "copy-on-write" (CoW 🐄), which will be the default starting in version 3.0 (I am not sure when that will be released).

The current schedule is late April / early May.

If you are certain that you don't modify the data in place, you can make it writeable again with:

arr.flags.writeable = True
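
A minimal sketch of that suggestion, using only NumPy. The read-only array is created by hand here as a stand-in. Whether the flag can be flipped on a given array depends on who owns the underlying buffer, and if the array is a view of other data, in-place writes will also change that data.

```python
import numpy as np

arr = np.array([-np.inf, 0.5, np.inf])
arr.flags.writeable = False   # simulate an array handed out read-only

arr.flags.writeable = True    # re-enable writes, as suggested above
arr[np.isneginf(arr)] = 0.0   # in-place assignment now succeeds
arr[np.isposinf(arr)] = 1.0
```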

has2k1 commented 8 months ago

I'm aware of pandas 3 and hope to sort this out by then. I hadn't thought about temporarily disabling CoW.

I think I am going to make a release that disables CoW in the interim.

khaeru commented 8 months ago

Would it be possible to use numpy.nan_to_num() ?

return np.nan_to_num(np.asarray(x), nan=np.nan, posinf=range[1], neginf=range[0])

glatterf42 commented 8 months ago

Would it be possible to use numpy.nan_to_num() ?

return np.nan_to_num(np.asarray(x), nan=np.nan, posinf=range[1], neginf=range[0])

I don't think so, this is causing the same error as above.

phofl commented 8 months ago

You can add copy=True if you want to be able to modify the array in place
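
One way to read this suggestion is to copy the input before assigning into it. The sketch below assumes that reading. `squish_infinite_copy` is a hypothetical name and not mizani's implementation; the `range` parameter name only mirrors the call shown in the traceback.

```python
import numpy as np

def squish_infinite_copy(x, range=(0.0, 1.0)):
    """Replace -inf/+inf with the range limits on a private copy of x."""
    # copy=True yields a fresh array that owns its data and is writeable,
    # even when x itself is read-only.
    _x = np.array(x, dtype=float, copy=True)
    _x[np.isneginf(_x)] = range[0]
    _x[np.isposinf(_x)] = range[1]
    return _x

# Usage: the read-only input is left untouched.
col = np.array([-np.inf, 0.25, np.inf, np.nan])
col.flags.writeable = False
out = squish_infinite_copy(col, range=(0.0, 10.0))
```

The copy costs extra memory, but the caller's data is never written to.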

has2k1 commented 8 months ago

This has been resolved in plotnine v0.13.2.