deephaven / deephaven-core

Deephaven Community Core
Other
249 stars 79 forks source link

Matplotlib/Seaborn bugs summary and plan #4644

Open alexpeters1208 opened 11 months ago

alexpeters1208 commented 11 months ago

This is a concise description of the extensive discussions that have gone on between myself, @rachelmbrubaker, @mofojed, @chipkent, @jjbrosnan, and others. Hopefully, this will help @rachelmbrubaker in collecting all of the relevant issues and discussions under one roof.

Description

Using Matplotlib from Deephaven was shown to have many bugs as of DHC 0.28.0 and DHE Vermilion. They were first outlined in a Jira ticket posted in September by @danshelley-illumon. More were discovered after this point. Here is a summary of those bugs:

  1. Calls to sns.lineplot() produce this error:

    ph-updateExecutor-10 | rumentedTableListenerBase | Uncaught exception for entry= Uninstrumented code, added.size()=1, modified.size()=0, removed.size()=0, shifted.size()=0, modifiedColumnSet={}:
    java.lang.RuntimeException: Error in Python interpreter:
    Type: <class 'TypeError'>
    Value: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
    Line: 2360
    Namespace: masked_invalid
    File: /Users/alexpeters/Documents/Core-Dev/matplotlib-dev/deephaven-core/matplotlib-dev-venv/lib/python3.11/site-packages/numpy/ma/core.py
  2. Calls to sns.scatterplot() produce this error if used on a table with 7 or more entries, or when a ticking table ticks up to 7 or more rows:

    r-Scheduler-Serial-1 | .c.ConsoleServiceGrpcImpl | Error running script: java.lang.RuntimeException: Error in Python interpreter:
    Type: <class 'TypeError'>
    Value: Cannot interpret 'Float64Dtype()' as a data type
    Line: 689
    Namespace: locator_to_legend_entries
    File: /Users/alexpeters/Documents/Core-Dev/matplotlib-dev/deephaven-core/matplotlib-dev-venv/lib/python3.11/site-packages/seaborn/utils.py
  3. Calls to both sns.lineplot() and sns.scatterplot() produce a litany of warnings that are increasingly annoying with ticking tables. Most of these warnings look like this:

    seaborn\_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead

Investigation

Investigation between myself, @jjbrosnan, and @mofojed revealed a couple of key things.

  1. There was a bug in our Pandas library that could cause some apparently non-deterministic behavior w.r.t. Matplotlib and Seaborn plots. That bug is described and resolved here. This did not resolve any of the specific errors posted above.
  2. There was a bug in Seaborn 0.12.2 related to how Seaborn was dealing with Pandas 2.0 that made many Seaborn functions throw the warnings we were seeing. That bug is described here and resolved here. This bug has been resolved as of Seaborn 0.13.0, so item number 3 from the list above is resolved.
  3. Unexpectedly, upgrading to Seaborn 0.13.0 also resolved item number 1 in the list above. This has been shown for both ticking and static tables.
  4. @mofojed reproduced item number 2 from the list above in plain Python, confirming it as a pure Seaborn bug. He filed an outstanding ticket against Seaborn here.

Additionally, @mofojed and @chipkent have expressed the opinion that all Deephaven-related Matplotlib usage should be done with Matplotlib's explicit interface, as opposed to their implicit interface, which are outlined and compared here. This will require updates to docs, examples, and tests on the community side, and likely the same on the enterprise side.

Path forward

  1. In order to make Seaborn and Deephaven play nice together, Seaborn 0.13.0 is a necessity.
  2. We need Seaborn to deal with the ticket @mofojed gifted them.
  3. The Pandas bug fix, which will be in DHC 0.29.0, is needed for stability.
  4. Existing documentation and examples need to be updated to only use Matplotlib's explicit interface. Ticket is here.
  5. Create, at minimum, a brief conceptual guide outlining the distinction between Matplotlib's implicit and explicit interfaces. As a Matplotlib user for many years, I was not consiously aware of this distinction. If we're going to insist that the explicit interface must be used with Deephaven, we need to provide a document explaining the distinction, as well as maybe some high-level details for why we need it to be this way. Ticket is here.
  6. Finally, the DHE test scripts that seem to have started this conversation need to be fixed and updated to conform to Matplotlib's explicit interface. I have a branch that fixes these scripts with some patchwork fixes. Some of these updates will be part of the final scripts, but they warrant an overhaul when these other things are done. I'm not sure who owns this DHE work.

Assigning this to @rachelmbrubaker not because she owns the work required, but because she will determine when this ticket is sufficiently resolved to close.

Current Progress as of 10/27/23

  1. Discussion in DHC ahead of 0.29.0 release on forcing Seaborn to be 0.13.0. This will require some thought from @devinrsmith, as we don't depend on Seaborn directly. No work as of yet, as far as I'm aware.
  2. No progress.
  3. COMPLETE: Fix is included in 0.29.0.
  4. COMPLETE
  5. Not planned.
  6. No progress, contingent on other things getting fixed.
mofojed commented 11 months ago

@alexpeters1208 heh, it seems others are confused about the matplotlib explicit vs. implicit interface as well:

In any case, I don't think we should be investing too much time into these. We can provide our preference for the explicit interface, but even then, for animations/ticking data we should be suggest using the deephaven-express interface instead (has better performance for ticking data).

jjbrosnan commented 11 months ago

Our documentation has been updated to not use the global scope (plt.clf(), plt.gcf(), or plt.anything()) in our docs. I will add a disclaimer to the docs as they are now warning users not to pollute the global scope. It really only matters when plotting data from multiple ticking sources, but things can go awry in confusing ways in that case.

mofojed commented 11 months ago

Another thing we may want to detail is how typings are handled. It looks like we're using the Pandas dtypes, like Float64DType, but seaborn/matplotlib do not fully support that data type. I think we're using the Float64DType so that it is nullable (is that correct @jmao-denver ?) so we should recommend in the matplotlib code to convert them to float64 type instead, and have some other way to filter out nulls.

alexpeters1208 commented 11 months ago

@rachelmbrubaker In a devrel meeting, we decided that the best move going forward is not to spend effort on creating a concept guide for Matplotlib's implicit vs. explicit interfaces. Instead, we're going to ensure that existing documentation highlights the need for using the explicit interface. There was talk of getting the code to throw a warning or an error when the implicit interface is used, but I'm not sure if it was decided if we really want that and if so, who owns that issue. Maybe @chipkent can comment. I am going to update the body of the ticket with an ongoing list of things that have been resolved or changed.

jjbrosnan commented 10 months ago

From a docs standpoint:

Once the linked PR is merged, community docs are considered fixed for the sake of this ticket.

alexpeters1208 commented 10 months ago

From a docs standpoint:

  • Our matplotlib how-to guide has been fixed to only use the explicit interface
  • Our seaborn how-to guide has been fixed to only use the explicit interface
  • There is a PR with updates to the how-to use plugins guide

Once the linked PR is merged, community docs are considered fixed for the sake of this ticket.

The PR mentioned above is merged, so I am going to update the ongoing list and mark that item as complete.

alexpeters1208 commented 8 months ago

Relates to 186. Some of the noted issues may be resolved when that ticket is addressed.