enthought / chaco

Chaco is a Python package for building interactive and custom 2-D plots.
http://docs.enthought.com/chaco/

Plotting data with nans causes numpy warnings #834

Closed nicolasap-dm closed 1 year ago

nicolasap-dm commented 2 years ago

Problem Description

A fairly minor method in DataRange1D causes numpy to emit a warning every time a scatter plot is created from data containing nans. I'm not sure whether this is a documented design principle, but Chaco seems to be built to tolerate nans in data, so it feels like nans should not cause warnings.

Reproduction Steps:

Running the following:

from numpy import sort
from numpy.random import random
from enable.api import Component, ComponentEditor
from traits.api import HasTraits, Instance
from traitsui.api import UItem, View
from chaco.api import ArrayPlotData, Plot

class Demo(HasTraits):
    plot = Instance(Component)
    traits_view = View(
        UItem("plot", editor=ComponentEditor(size=(650, 650))),
        resizable=True)

    def _plot_default(self):
        x, y = sort(random(50)), random(50)
        x[5] = float("nan")
        y[5] = float("nan")

        pd = ArrayPlotData()
        pd.set_data("index", x)
        pd.set_data("value", y)
        plot = Plot(pd)
        plot.plot(("index", "value"), type="scatter")
        return plot

Demo().configure_traits()

works well, but emits:

/home/ndemitri/.edm/envs/ds-msrl-dev/lib/python3.6/site-packages/chaco/data_range_1d.py:126: RuntimeWarning: invalid value encountered in greater_equal
  return (data.view(ndarray) >= self._low_value) & (
/home/ndemitri/.edm/envs/ds-msrl-dev/lib/python3.6/site-packages/chaco/data_range_1d.py:127: RuntimeWarning: invalid value encountered in less_equal
  data.view(ndarray) <= self._high_value

(Note: the same happens when running the nans_plot example verbatim)

Expected behavior:

No warning should be emitted.


Note: the warning is emitted here:

https://github.com/enthought/chaco/blob/8a8162f465ce7c4b6a17eef36a2017a38c1a4201/chaco/data_range_1d.py#L126-L127

and is the result of comparing a nan-containing array with a float, twice. The warning makes sense in general numpy usage, but what it warns against is fine for the DataRange1D use case: nan values get a False mask, i.e. they are treated as outside the plotted range.
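
One possible way to keep that behavior while silencing the warning is to wrap the two comparisons in `numpy.errstate(invalid="ignore")`. The snippet below is only a sketch of the idea, not the fix adopted in Chaco: `mask_in_range` is a hypothetical standalone function standing in for the DataRange1D method, and `low` / `high` stand in for `self._low_value` / `self._high_value`. Whether the unguarded comparison warns at all depends on the numpy version and platform, so the sketch only checks that the guarded version stays silent.

```python
import warnings

import numpy as np


def mask_in_range(data, low, high):
    """Return a boolean mask of the points within [low, high].

    Hypothetical stand-in for the range-mask logic discussed above.
    NaNs compare as False on both sides, so they end up masked out.
    """
    data = np.asarray(data, dtype=float)
    # Suppress only the floating-point "invalid" category, and only
    # for the duration of these two comparisons.
    with np.errstate(invalid="ignore"):
        return (data >= low) & (data <= high)


data = np.array([0.1, np.nan, 0.5, 0.9])

with warnings.catch_warnings():
    # Turn every warning into an exception, so a RuntimeWarning from
    # the comparisons would make this block fail loudly.
    warnings.simplefilter("error")
    mask = mask_in_range(data, 0.2, 1.0)

print(mask.tolist())  # [False, False, True, True]
```

Scoping the suppression with a context manager leaves numpy's global error settings untouched, so genuine invalid-value warnings elsewhere in an application would still be reported.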