pnezis / tucan

An Elixir plotting library on top of VegaLite
https://hexdocs.pm/tucan/Tucan.html
MIT License

Plotting DataFrames with `Elixir.Time` columns #22

Open spencerkent opened 1 week ago

spencerkent commented 1 week ago

Hey, thanks for the useful library!

I was surprised to get a blank graph in response to the following:

iex> times = 0..10 |> Enum.map(&(DateTime.utc_now() |> DateTime.to_time() |> Time.add(&1)))
[~T[05:33:08.404477], ~T[05:33:09.404555], ~T[05:33:10.404564], ~T[05:33:11.404577],
 ~T[05:33:12.404579], ~T[05:33:13.404581], ~T[05:33:14.404583], ~T[05:33:15.404586],
 ~T[05:33:16.404588], ~T[05:33:17.404590], ~T[05:33:18.404592]]
iex> alias Explorer.DataFrame, as: DF
iex> test_df = DF.new(
  %{
      time_col: times,
      value_col: 10..20 |> Enum.to_list()
    }
)
#Explorer.DataFrame<
  Polars[11 x 2]
  time_col time [05:33:08.404477, 05:33:09.404555, 05:33:10.404564, 05:33:11.404577,
   05:33:12.404579, ...]
  value_col s64 [10, 11, 12, 13, 14, ...]
>
Tucan.lineplot(test_df, "time_col", "value_col")
[Screenshot: the resulting blank plot]

Providing the temporal-type argument does not yield anything different:

Tucan.lineplot(test_df, "time_col", "value_col", x: [type: :temporal])

Am I missing a configuration parameter that's needed for Tucan to show the plot?

pnezis commented 1 week ago

Hi @spencerkent there are two issues here:

  1. You are passing `Time` structs, which Vega-Lite does not know how to handle. Instead, pass timestamps in a string format that Vega-Lite can decode, e.g. ISO 8601. For details about the expected time formats, see the Vega-Lite documentation. Vega-Lite has an option to set the format for non-ISO timestamps, but Tucan does not currently support it.

  2. Vega-Lite supports at most millisecond precision, while `DateTime.utc_now/0` returns microsecond precision by default.

You could do the following:

now = DateTime.utc_now(:second)

times =
  0..10
  |> Enum.map(fn x -> DateTime.add(now, x, :second) end)
  |> Enum.map(&DateTime.to_iso8601/1)

y = 10..20 |> Enum.to_list()

Tucan.lineplot([t: times, y: y], "t", "y", x: [type: :temporal, axis: [format: "%H:%M:%S"]])

The `axis` option is optional; it forces the axis labels to display the whole timestamp. This will give you the following plot:

[image: the resulting line plot]

spencerkent commented 1 week ago

Okay, thanks! Tucan makes using Vega-Lite a lot easier, so it might be in that same spirit for Tucan to handle this conversion for us. Having to know the details of Vega-Lite's timestamp handling kind of defeats the purpose of Tucan's abstraction.

pnezis commented 1 week ago

@spencerkent correction: no explicit casting is needed if you pass `DateTime` structs. The following works:

now = DateTime.utc_now()

t = Enum.map(0..10, fn x -> DateTime.add(now, x, :second) end)
y = Enum.to_list(10..20)

Tucan.lineplot([t: t, y: y], "t", "y", x: [type: :temporal])

The issue is present only if you pass `%Time{}` items. In that case you need to define the casting explicitly.
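
One possible workaround for data that already exists as `%Time{}` values, sketched here with the standard library only: anchor each time on an arbitrary date so that it becomes a `DateTime`, which the snippet above reports as working. The module name `TimeAxisSketch`, the function `to_datetime/1`, and the 1970-01-01 anchor date are illustrative assumptions, not part of Tucan.

```elixir
defmodule TimeAxisSketch do
  # Hypothetical helper, not part of Tucan or VegaLite.
  # The anchor date is arbitrary; only the time-of-day part is meaningful.
  @anchor ~D[1970-01-01]

  # Combines the anchor date with the given Time into a UTC DateTime and
  # truncates it to millisecond precision (Vega-Lite's maximum).
  def to_datetime(%Time{} = time) do
    @anchor
    |> DateTime.new!(time)
    |> DateTime.truncate(:millisecond)
  end
end

times = [~T[05:33:08.404477], ~T[05:33:09.404555], ~T[05:33:10.404564]]

# DateTime values that can be passed to the plot in place of the Time structs.
t = Enum.map(times, &TimeAxisSketch.to_datetime/1)
y = [10, 11, 12]
```

The resulting `t` and `y` could then be passed as in the earlier call, `Tucan.lineplot([t: t, y: y], "t", "y", x: [type: :temporal, axis: [format: "%H:%M:%S"]])`, where the axis format hides the placeholder date. Since the values are UTC, the labels may be shifted to the viewer's local time zone when rendered.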

We could support something like this that automatically sets the proper format to the dataset:

Tucan.lineplot([t: t, y: y], "t", "y", x: [type: :temporal], format: [t: :time])

WDYT?

spencerkent commented 1 day ago

Since "temporal" to me means either a DateTime or a Time, it feels clunky to have to handle Times by adding the extra argument like you show. It would be nice to have the same user interface across the two types, with Tucan handling the conversion from an Elixir Time to the format VegaLite expects under the hood.

pnezis commented 1 day ago

The issue here is that VegaLite itself does not handle times unless you explicitly specify the format.
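
To illustrate what "explicitly specify the format" could look like, here is a rough sketch of a raw Vega-Lite spec built as a plain Elixir map. It assumes that Vega-Lite's `data.format.parse` property accepts a `"date:'<d3 time format>'"` string for a field, which is how I read the Vega-Lite data format documentation. The module `ParseFormatSketch` and its `spec/2` function are hypothetical and use no Tucan or VegaLite calls.

```elixir
defmodule ParseFormatSketch do
  # Hypothetical helper: builds a Vega-Lite spec (as a plain map) whose
  # dataset declares how the time-only strings in field "t" are parsed.
  def spec(times, ys) do
    values =
      times
      |> Enum.zip(ys)
      |> Enum.map(fn {time, y} ->
        # Render each Time as "HH:MM:SS", dropping the fractional seconds.
        %{"t" => time |> Time.truncate(:second) |> Time.to_iso8601(), "y" => y}
      end)

    %{
      "data" => %{
        "values" => values,
        # Assumption: tells Vega-Lite to parse "t" with this d3 time format.
        "format" => %{"parse" => %{"t" => "date:'%H:%M:%S'"}}
      },
      "mark" => "line",
      "encoding" => %{
        "x" => %{"field" => "t", "type" => "temporal"},
        "y" => %{"field" => "y", "type" => "quantitative"}
      }
    }
  end
end

spec = ParseFormatSketch.spec([~T[05:33:08.404477], ~T[05:33:09.404555]], [10, 11])
```

A `format: [t: :time]` option like the one proposed above would presumably generate something along these lines, so that users do not have to write the parse string themselves.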