livebook-dev / kino_vega_lite

Vega-Lite (graphics) integration for Livebook
Apache License 2.0

Data truncated when using a VegaLite chart #14

Closed by erszcz 2 years ago

erszcz commented 2 years ago

Hi! First of all, thanks for Livebook - it's a tool with huge potential!

I'm running into an issue generating charts. My livebook is available at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#file-dialyzer-etc-gradualizer-livemd and a version with persisted outputs at https://gist.github.com/erszcz/4d43a77464c87a514e71eecf2811af63#file-dialyzer-etc-gradualizer-persisted-outputs-2022-07-14_103242-livemd.

As can be seen in one of the sections, I do a sanity check of the series lengths:

# Sanity check!
[103, 16, 86, 11] = [
  by_test_type["should_pass"] |> Enum.filter(fn e -> e == "all tests" end) |> length,
  by_test_type["known_problems_should_pass"]
  |> Enum.filter(fn e -> e == "all tests" end)
  |> length,
  by_test_type["should_fail"] |> Enum.filter(fn e -> e == "all tests" end) |> length,
  by_test_type["known_problems_should_fail"]
  |> Enum.filter(fn e -> e == "all tests" end)
  |> length
]
[103, 16, 86, 11]

However, as can be seen in the persisted VegaLite JSON, the data is truncated to 12 "all tests" entries:

$ jq . vl.should_pass.json | grep "all tests" | wc -l
      12

The series should be 103 entries long, as shown in the Elixir snippet. Something similar happens for the 2 subsequent graphs, too. The only one that's drawn with the correct number of entries is the 4th one, i.e. known_problems_should_fail. The data length in its case is just 11 (as seen in the sanity check) - it's the shortest series. Could there be a limit on data length passed to VegaLite?

I'm running Livebook in Docker, using the current latest tag:

$ docker run -p 8080:8080 -p 8081:8081 --pull always -v $PWD:/data livebook/livebook
latest: Pulling from livebook/livebook
Digest: sha256:3cbc5ea39883d72a375f8ef3ceb3b35733b43a17a8119a5b35819277e2cd3e61
Status: Image is up to date for livebook/livebook:latest
[Livebook] Application running at http://0.0.0.0:8080/?token=e43z4b3q22srebh6vhslmt7ie4okizpu

Am I misusing something or might there be a bug lurking here?

josevalim commented 2 years ago

I believe the issue is that the format you are passing assumes that each key is a column of the same table, so it expects all of them to have the same number of rows. If they don't, it trims to the shortest.

@jonatanklosko should table warn or raise here?

@erszcz you should do a Map.take/2 before you pass the data to VegaLite.
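A minimal sketch of the Map.take/2 suggestion, using only the standard library. The map below is a made-up stand-in for the notebook's by_test_type (the real series have 103, 16, 86 and 11 entries); the assumption is that the reduced map is what would then be handed to VegaLite for the should_pass chart.

```elixir
# Hypothetical stand-in for the notebook's `by_test_type` map:
# each key is a series name, each value is a list of a different length.
by_test_type = %{
  "should_pass" => List.duplicate("all tests", 5),
  "known_problems_should_pass" => List.duplicate("all tests", 3),
  "should_fail" => List.duplicate("all tests", 4),
  "known_problems_should_fail" => List.duplicate("all tests", 2)
}

# Keep only the series this chart needs, so no shorter column
# remains in the map to cut the longer one down.
should_pass_only = Map.take(by_test_type, ["should_pass"])

# The single remaining column keeps its full length.
5 = length(should_pass_only["should_pass"])
```

Each chart then gets its own one-column map, so the series lengths no longer need to agree with each other.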

jonatanklosko commented 2 years ago

@josevalim that's it, we recognise it as tabular data, but strictly speaking it isn't because of the non-matching lengths, so zipping columns into rows truncates to the shortest. I think ideally Table.Reader.init should return :none, but this requires counting each column, so an error may be a better option.
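The truncation described above can be reproduced with plain Enum.zip/1. This is only an illustration of the zipping behaviour, not the Table library's actual code; the column names and values are invented.

```elixir
# Two "columns" of different lengths (illustrative data only).
columns = %{
  "a" => [1, 2, 3, 4],
  "b" => [:x, :y]
}

keys = Map.keys(columns)

# Zip the columns position by position. Enum.zip/1 stops as soon as
# the shortest list runs out, so the extra values in "a" are dropped.
rows =
  columns
  |> Map.values()
  |> Enum.zip()
  |> Enum.map(fn values ->
    keys |> Enum.zip(Tuple.to_list(values)) |> Map.new()
  end)

# Only two rows survive, matching the length of the shortest column.
2 = length(rows)
```

This matches what the issue reports: the only chart with the full number of entries is the one built from the shortest series.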

jonatanklosko commented 2 years ago

Although an error only helps for to_rows, to_columns will still return a map of non-matching columns 🤔
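For illustration, a caller-side check along the lines discussed could look like the sketch below. ColumnCheck and validate!/1 are hypothetical names, not part of the Table library, and the sketch assumes every column is a list. It counts each column, which is the cost mentioned above.

```elixir
defmodule ColumnCheck do
  # Hypothetical helper, not part of the Table library.
  # Returns the map unchanged when all columns have the same length,
  # otherwise raises with the length of every column.
  def validate!(columns) when is_map(columns) do
    lengths = Map.new(columns, fn {name, values} -> {name, length(values)} end)

    case lengths |> Map.values() |> Enum.uniq() do
      # No columns at all: nothing to compare.
      [] ->
        columns

      # Exactly one distinct length: the data is properly tabular.
      [_single] ->
        columns

      _mismatch ->
        raise ArgumentError, "columns have different lengths: #{inspect(lengths)}"
    end
  end
end

# Matching columns pass through untouched.
%{"a" => [1, 2], "b" => [3, 4]} = ColumnCheck.validate!(%{"a" => [1, 2], "b" => [3, 4]})
```

Running such a check before building a chart would turn the silent truncation into an explicit error, whichever of to_rows or to_columns is used afterwards.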

erszcz commented 2 years ago

Great, thank you for a prompt explanation! I've found a way around it. Thanks again for an awesome tool 🤩