JuliaTeX / PGFPlots.jl

This library uses the LaTeX package pgfplots to produce plots.

Missings #136

Open alfredjmduncan opened 4 years ago

alfredjmduncan commented 4 years ago

Currently, plot functions accept data with type AbstractMatrix{Real}. This means that the code throws a MethodError when passed data that allows for or includes missing values.
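A minimal sketch of the failure described above. `plot_series` is a hypothetical stand-in, not a PGFPlots.jl function, and its signature is written as `AbstractMatrix{<:Real}` on the assumption that this is what the shorthand AbstractMatrix{Real} refers to.

```julia
# Hypothetical stand-in for a plotting method that only accepts real-valued data.
plot_series(data::AbstractMatrix{<:Real}) = size(data)

complete = [1.0 2.0 3.0; 4.0 5.0 6.0]
with_gaps = Union{Missing,Float64}[1.0 missing 3.0; 4.0 5.0 missing]

plot_series(complete)  # a Matrix{Float64} dispatches fine

# Union{Missing, Float64} is not a subtype of Real, so no method matches
# and Julia throws a MethodError.
rejected = try
    plot_series(with_gaps)
    false
catch err
    err isa MethodError
end
```

The error is raised at dispatch, before any plotting code runs, which is why the data has to be cleaned by the caller today.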

When plotting multiple time series with different frequencies / time spans, there can be quite a bit of messy wrangling required before passing the data through to the plotting functions in PGFPlots.

It would be much appreciated if the plot functions could accept data as AbstractMatrix{Union{Missing,Real}}, with PGFPlots dropping the missing values for each trace before plotting.

mykelk commented 4 years ago

Great idea! We'd welcome a PR.

alfredjmduncan commented 4 years ago

There are a few design questions. I guess the two main options are to

  1. Pass the missings to PGFPlots as nan, which is a standard way to code missing values in PGFPlots. This would mean
    • Updating the plotHelper functions in PGFPlots.jl, then
    • updating the accepted Real / Complex types to Union{Real,Missing} / Union{Complex,Missing} throughout.

(It would also be possible to pass the missings as empty strings, which is more appealing than nans in some ways. But this only works in PGFPlots if values are delimited with commas or semicolons, and in some plotHelper functions values are currently delimited with spaces.)

  2. Allow missings only when passing a DataFrame to PGFPlots, and filter the missings out of the DataFrame columns provided before dispatching into the plotting functions. This would only require updating lines 45-51 of PGFPlots.jl.
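The filtering in the second option could look roughly like this. It is a sketch only: plain vectors stand in for the DataFrame columns, and the names `year` and `rate` are made up for illustration.

```julia
# Plain vectors stand in for two DataFrame columns (hypothetical data).
year = [2001, 2002, 2003, 2004]
rate = [1.0, missing, 1.2, missing]

# Keep a row only if both entries are present, so x and y stay aligned.
rows = [(x, y) for (x, y) in zip(year, rate) if !ismissing(x) && !ismissing(y)]

xs = first.(rows)  # x values of the surviving rows
ys = last.(rows)   # y values of the surviving rows, now a plain Vector{Float64}
```

Filtering row by row, rather than dropping missings from each column separately, keeps the two columns the same length.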

(2) is a much smaller change, but drops some useful information from the resulting .tex output files. (1) would allow the user to set whether PGFPlots skips or jumps missing values, which is a useful feature in PGFPlots.
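For comparison, option (1) could be sketched as below. `fmt` and `coordinate_table` are hypothetical helpers, not existing PGFPlots.jl functions, and the space-delimited layout is an assumption based on the delimiter note above.

```julia
# Hypothetical serializer: code a missing value as nan, print anything else as-is.
fmt(v) = ismissing(v) ? "nan" : string(v)

# Write one space-delimited "x y" pair per line.
function coordinate_table(x, y)
    length(x) == length(y) || throw(DimensionMismatch("x and y must have equal length"))
    return join((fmt(a) * " " * fmt(b) for (a, b) in zip(x, y)), "\n")
end

x = [1, 2, 3, 4]
y = [0.5, missing, 1.5, 2.0]
table = coordinate_table(x, y)
```

Because the nan rows survive into the .tex file, the pgfplots axis option `unbounded coords=discard` or `unbounded coords=jump` can then decide whether the line is drawn through the gap or broken at it.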

mykelk commented 4 years ago

@tawheeler Do you have a preference?

tawheeler commented 4 years ago

Julia now has core support for missing values. It seems to make sense to support that directly in PGFPlots.jl as well.

PGFPlots.jl is a weird package in that we basically don't add type annotations to anything, and instead rely on each value's type to dictate how it gets serialized to text when writing to a .tex file. The data itself is the exception to this. As @alfredjmduncan points out, the data fields are of type AbstractMatrix{Real}. I like the idea of moving to Union{Missing, Real}, and then using skipmissing in plotHelper.
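A rough sketch of that direction. `present_points` is a hypothetical helper, not the package's actual plotHelper, and it assumes that each column of the matrix holds one point.

```julia
# The `<:` matters: Julia's type parameters are invariant, so a Matrix{Float64}
# is not an AbstractMatrix{Union{Missing, Real}}, but it does match this signature.
function present_points(data::AbstractMatrix{<:Union{Missing,Real}})
    # Keep only the columns (points) in which every coordinate is present.
    keep = [!any(ismissing, col) for col in eachcol(data)]
    kept = data[:, keep]
    # Nothing missing is left, so skipmissing passes every entry through and
    # collect returns a vector with the narrower element type.
    return reshape(collect(skipmissing(kept)), size(kept))
end

gappy = Union{Missing,Float64}[1.0 2.0 3.0; 4.0 missing 6.0]
clean = present_points(gappy)        # drops the second point
plain = present_points([1 2; 3 4])   # data without missings is still accepted
```

Dropping whole columns keeps the x and y coordinates paired. Applying skipmissing to each row on its own would shift the remaining values out of alignment.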

mykelk commented 4 years ago

Sounds good @tawheeler

alfredjmduncan commented 4 years ago

OK great! I'll have a go at the PR.