KristofferC / PGFPlotsX.jl

Plots in Julia using the PGFPlots LaTeX package

Unused columns create problems when using Table to create figure #272

Closed clintonTE closed 2 years ago

clintonTE commented 3 years ago

I'll note that this issue manifests in a number of different ways, including different error messages and silent corruption. The error only occurs when the offending column is before the last column used in the figure.

using DataFrames, Dates, PGFPlotsX, Colors, LaTeXStrings

function measureratiofigure(;)

    measure = DataFrame(date=lastdayofmonth.(Date(2010,1,1):Month(1):Date(2011,11,11)))
    measure[!, :obj] = [Dict(:a=>rand(), :b=>rand()) for i in 1:nrow(measure)]
    measure[!, :percentrat] = 1:nrow(measure) .|> (x)-> rand()/x

    #NOTE: if you uncomment this line, the code will work
    #measure=measure[:, [:date, :percentrat]]

    figureratio = @pgf Axis(
        {
            height = "18cm",
            width = "24cm",
            date_coordinates_in = "x",
            xticklabel = {"\\year"},
            ylabel = raw"(\%)",
            legend_pos = "south east",
        },
        PlotInc(
            {
                no_marks,
            },
            Table(
                {
                    x = "date",
                    y = "percentrat",
                },
                measure,
            )),
        LegendEntry(raw"series"),
    )

    display(figureratio)
end

measureratiofigure()

Output:

ERROR: LoadError: ArgumentError: No tex function available for data of type Dict{Symbol, Float64}. Define one by overloading print_tex(io::IO, data::T) where T is the type of the data to dispatch on.

tpapp commented 3 years ago

This is a consequence of PGFPlotsX dumping the whole DataFrame within Table: it does not form a subset based on x and y. Until this is fixed, you can use

Table(measure.date, measure.percentrat)

as a workaround.

(The culprit is this line, I will make a PR fixing it).
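
If you would rather keep passing a DataFrame to Table, the same idea can be applied by subsetting the columns first, much like the commented-out line in the original report. The following is only a sketch under that assumption; `plotdata` is a name made up for this example, and the data is random, as in the report.

```julia
using DataFrames, Dates, PGFPlotsX

# Same column layout as in the report: a Date column, a Dict column, a Float64 column.
measure = DataFrame(date = lastdayofmonth.(Date(2010, 1, 1):Month(1):Date(2010, 6, 1)))
measure[!, :obj] = [Dict(:a => rand(), :b => rand()) for _ in 1:nrow(measure)]
measure[!, :percentrat] = rand(nrow(measure)) ./ (1:nrow(measure))

# Keep only the columns the plot refers to, so the Dict column never reaches Table.
# `plotdata` is a hypothetical name used only in this sketch.
plotdata = select(measure, [:date, :percentrat])

# Build the table from the reduced DataFrame, with the same options as in the report.
table = @pgf Table({x = "date", y = "percentrat"}, plotdata)
```

`select` returns a new DataFrame, so `measure` itself still has the `obj` column for any other use.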

clintonTE commented 3 years ago

Not part of the issue, but this package makes the figures for my thesis look amazing. Really a cut above anything else I have tried.

tpapp commented 2 years ago

Sorry for the long delay. I am not entirely sure that this is an issue with PGFPlotsX. In particular, you don't have unused columns in your data, you have columns which themselves contain Dicts:

julia> first(measure)
DataFrameRow
 Row │ date        obj                               percentrat
     │ Date        Dict…                             Float64
─────┼──────────────────────────────────────────────────────────
   1 │ 2010-01-31  Dict(:a=>0.187707, :b=>0.429964)    0.404749

PGFPlotsX does not know how to deal with this, so it just gives an error.

We could check input early (for strings, numbers, missing, etc.) but I feel that would go against the design of this package: after all, a user could define a method for print_tex for whatever type they want and have it work fine.
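
To illustrate what defining such a method could look like, here is a minimal sketch. The `Percent` type is hypothetical and exists only for this example; the method signature follows the one named in the error message above, and how the value should be rendered is an assumption that depends on what the plot needs.

```julia
using PGFPlotsX

# Hypothetical wrapper type, introduced only for this illustration.
struct Percent
    value::Float64
end

# Tell PGFPlotsX how to write a Percent into the generated TeX: as a plain number.
PGFPlotsX.print_tex(io::IO, p::Percent) = print(io, 100 * p.value)

# sprint calls print_tex(io, Percent(0.25)) and collects what was written as a String.
rendered = sprint(PGFPlotsX.print_tex, Percent(0.25))
```

With a method like this in place, values of that type no longer hit the "No tex function available" fallback.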