pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Colormap Normalisation Giving Unexpected/Incorrect Output #4061

Open rjp23 opened 4 years ago

rjp23 commented 4 years ago

Passing "norm" to apply a colormap normalisation does not behave as anticipated.

Below I use the example code from matplotlib and apply the same normalisation to the DataArray version of the data but get very different results.

MCVE Code Sample

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import xarray 

#example from https://matplotlib.org/3.1.1/tutorials/colors/colormapnorms.html
#for colormap normalisation

N = 100
X, Y = np.mgrid[-3:3:complex(0, N), -2:2:complex(0, N)]
Z1 = np.exp(-X**2 - Y**2)
Z2 = np.exp(-(X - 1)**2 - (Y - 1)**2)
Z = (Z1 - Z2) * 2

fig, ax = plt.subplots(2, 1, figsize=(8, 8))
ax = ax.flatten()

bounds = np.linspace(-1, 1, 10)
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=256)

ax[0].pcolormesh(X, Y, Z,
                       norm=norm,
                       cmap='RdBu_r')

#now add data into dataset and plot it using same normalisation
data = xarray.DataArray(Z, dims=('x', 'y'), coords={'x': X[:,0], 'y': Y[0,:]})
data.plot(ax=ax[1], x='x', y='y', norm=norm, add_colorbar=False)

plt.show()

Expected Output

Top is expected, bottom is actual

[image: Screenshot 2020-05-14 at 12 00 18]

Problem Description

Colormap normalisation appears to be broken in xarray

Versions

Output of xr.show_versions()

xarray.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.21.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.15.1
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.3.1
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.0.28
cfgrib: None
iris: 2.2.0
bottleneck: None
dask: 2.5.2
distributed: 2.5.2
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.4.0
pip: 19.2.3
conda: None
pytest: None
IPython: 7.8.0
sphinx: None

rjp23 commented 4 years ago

It might be useful to note here that my solution for now was just to pass the arrays directly to pcolormesh rather than going through the xarray plot interface.

i.e. I changed from

img = ds['var'].plot(ax=ax, cmap=cmap, norm=norm)

to

img = ax.pcolormesh(ds.x.values, ds.y.values, ds['var'].values, cmap=cmap, norm=norm)

mathause commented 4 years ago

What should also work is

img = ds['var'].plot(ax=ax, cmap=cmap, levels=levels)

mathause commented 4 years ago

If you use

norm = colors.BoundaryNorm(boundaries=bounds, ncolors=9)

it works. A new norm with ncolors=9 is built in _build_discrete_cmap; however, it is not used because norm is already defined:

https://github.com/pydata/xarray/blob/2542a63f6ebed1a464af7fc74b9f3bf302925803/xarray/plot/utils.py#L290-L292

The fix might be to use:

 if levels is not None or isinstance(norm, mpl.colors.BoundaryNorm): 
     cmap, norm = _build_discrete_cmap(cmap, levels, extend, filled) 

This breaks one test, which is probably fixable (it tests that the norm is not changed when it is given).

Note that mpl seems to use a LinearSegmentedColormap while xarray creates a ListedColormap.
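
A rough way to see why ncolors matters, using only matplotlib and NumPy: the 10 boundaries from the example define 9 intervals, so a norm built with ncolors=9 returns indices 0 to 8, while ncolors=256 spreads the same intervals over 0 to 255. The 9-entry grey ListedColormap below is a hypothetical stand-in for a discrete colormap of that size; it is not xarray's actual code, and the explanation is only one plausible reading of the behaviour described above.

```python
import numpy as np
import matplotlib.colors as colors

bounds = np.linspace(-1, 1, 10)              # 10 boundaries -> 9 intervals
centres = (bounds[:-1] + bounds[1:]) / 2     # one sample value per interval

norm256 = colors.BoundaryNorm(boundaries=bounds, ncolors=256)
norm9 = colors.BoundaryNorm(boundaries=bounds, ncolors=9)

idx256 = np.asarray(norm256(centres))        # indices spread over 0..255
idx9 = np.asarray(norm9(centres))            # indices 0..8

# Hypothetical stand-in for a 9-colour discrete colormap (greys for simplicity).
cmap9 = colors.ListedColormap([(g, g, g) for g in np.linspace(0, 1, 9)])

# Count how many distinct colours each norm actually selects from cmap9.
n_distinct_256 = len(np.unique(cmap9(idx256), axis=0))
n_distinct_9 = len(np.unique(cmap9(idx9), axis=0))

print(idx256, idx9)
print(n_distinct_256, n_distinct_9)
```

With the 9-colour map, the ncolors=256 indices mostly fall past the last entry, so several intervals share one colour, whereas ncolors=9 picks a separate colour for each interval.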

rjp23 commented 4 years ago

Neither of those solutions seems to work for diverging colormaps where the data extends outside the range, i.e. the data below the lowest and above the highest boundary ends up with no colour.

mathause commented 4 years ago

That's probably going to need an extend="both"
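
On the matplotlib side, a BoundaryNorm sends values outside the boundaries to indices just beyond the colormap, which select its "under" and "over" colours. The sketch below illustrates only that mechanism and says nothing about how xarray handles extend. It assumes matplotlib 3.4 or newer for Colormap.with_extremes, and the grey colormap with blue/red extremes is a made-up example.

```python
import numpy as np
import matplotlib.colors as colors

bounds = np.linspace(-1, 1, 10)
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=9)

# Values below, inside and above the boundary range.
values = np.array([-2.0, 0.0, 2.0])
idx = np.asarray(norm(values))    # below -> -1, above -> ncolors (here 9)

# Hypothetical 9-colour grey colormap with explicit under/over colours.
base = colors.ListedColormap([(g, g, g) for g in np.linspace(0.2, 0.8, 9)])
cmap = base.with_extremes(under="blue", over="red")

rgba = cmap(idx)
print(idx)
print(rgba[0], rgba[-1])          # under colour, over colour
```

Whether out-of-range data appears coloured therefore depends on what the colormap in use defines for its under and over entries.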

dcherian commented 4 years ago

Seems like we should do whatever mpl does when given a norm and a colormap

huaracheguarache commented 2 years ago

I also have an issue where xarray doesn't produce the correct plot when normalizing with BoundaryNorm:

import xarray as xr
import matplotlib.pyplot as plt
import matplotlib.colors as colors
from cmcrameri import cm

airtemps = xr.tutorial.open_dataset("air_temperature")

# Convert to Celsius.
air = airtemps.air - 273.15
air.attrs = airtemps.air.attrs
air.attrs["units"] = "deg C"

# Select a timestep.
air2d = air.isel(time=500)

# Plotting discrete bounds with matplotlib works fine.
bounds = list(range(-30, 31, 10))
norm = colors.BoundaryNorm(boundaries=bounds, extend='both', ncolors=cm.vik.N)

fig, ax = plt.subplots()
cs = ax.pcolormesh(air2d.lon, air2d.lat, air2d, cmap=cm.vik, norm=norm)
fig.colorbar(cs)
plt.show()

# Plotting with xarray doesn't work.
fig, ax = plt.subplots()
air2d.plot.pcolormesh(ax=ax, norm=norm)
plt.show()

First one is from matplotlib: [image: matplotlib]

Second one is from xarray: [image: xarray]

I also get the following traceback after running the script:

Traceback (most recent call last):
  File "/home/michael/miniconda3/envs/testing_xarray/lib/python3.10/site-packages/matplotlib/cbook/__init__.py", line 287, in process
    func(*args, **kwargs)
  File "/home/michael/miniconda3/envs/testing_xarray/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 3056, in mouse_move
    s = self._mouse_event_to_message(event)
  File "/home/michael/miniconda3/envs/testing_xarray/lib/python3.10/site-packages/matplotlib/backend_bases.py", line 3048, in _mouse_event_to_message
    data_str = a.format_cursor_data(data).rstrip()
  File "/home/michael/miniconda3/envs/testing_xarray/lib/python3.10/site-packages/matplotlib/artist.py", line 1282, in format_cursor_data
    neighbors = self.norm.inverse(
  File "/home/michael/miniconda3/envs/testing_xarray/lib/python3.10/site-packages/matplotlib/colors.py", line 1832, in inverse
    raise ValueError("BoundaryNorm is not invertible")
ValueError: BoundaryNorm is not invertible
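
The traceback shows the error being raised from format_cursor_data during a mouse move, which calls the norm's inverse method. The same ValueError can be reproduced without any plot, as in this minimal sketch using the boundaries from the script above:

```python
import matplotlib.colors as colors

# Same boundaries as the script above; no plot is needed.
norm = colors.BoundaryNorm(boundaries=[-30, -20, -10, 0, 10, 20, 30], ncolors=256)

try:
    norm.inverse(0.5)
    message = None
except ValueError as err:
    # BoundaryNorm maps whole intervals to one colour, so there is no unique inverse.
    message = str(err)

print(message)
```
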
Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:25:18) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.14.18-300.fc35.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.20.1
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.7.2
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.0
cartopy: 0.20.1
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 59.2.0
pip: 21.3.1
conda: None
pytest: None
IPython: 7.29.0
sphinx: None

veenstrajelmer commented 1 year ago

As suggested in https://github.com/pydata/xarray/pull/7553#discussion_r1117264787, pass levels=bounds instead of norm=norm to data.plot(). With that change, your example code produces the expected plot. Would this solve your issue?

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as colors
import xarray 

#example from https://matplotlib.org/3.1.1/tutorials/colors/colormapnorms.html
#for colormap normalisation

N = 100
X, Y = np.mgrid[-3:3:complex(0, N), -2:2:complex(0, N)]
Z1 = np.exp(-X**2 - Y**2)
Z2 = np.exp(-(X - 1)**2 - (Y - 1)**2)
Z = (Z1 - Z2) * 2

fig, ax = plt.subplots(2, 1, figsize=(8, 8))
ax = ax.flatten()

bounds = np.linspace(-1, 1, 10)
norm = colors.BoundaryNorm(boundaries=bounds, ncolors=256)

ax[0].pcolormesh(X, Y, Z, norm=norm, cmap='RdBu_r')

#now add data into dataset and plot it using same normalisation
data = xarray.DataArray(Z, dims=('x', 'y'), coords={'x': X[:,0], 'y': Y[0,:]})
data.plot(ax=ax[1], x='x', y='y', levels=bounds, add_colorbar=False)


veenstrajelmer commented 1 year ago

@rjp23: could you close the issue if this indeed resolves your problem?

rjp23 commented 1 year ago

I’m not sure that’s a solution if we argue that xarray should do what matplotlib does with the same keywords.

I can test if this “works” and will report back but it’s still not a fix.

veenstrajelmer commented 1 year ago

@rjp23 with the latest update to the PR (thanks to @jklymak), your original example code now produces identical figures without any changes.