vincentarelbundock / tinytable

Simple and Customizable Tables in `R`
https://vincentarelbundock.github.io/tinytable
GNU General Public License v3.0
196 stars 16 forks source link

Issue#340 #341

Closed J-Moravec closed 1 week ago

J-Moravec commented 2 weeks ago

Fix for #340.

The current implementation simply translates the png images into base64encoded string within the save_tt() function. The option portable is marked as experimental.

A better implementation might utilize the lazy eval of the plots and translate them during the building process. This would certainly avoid many different bugs. The proper fix should likely be somewhere in the build_tt.R, or even in plot_tt.R and using some way of passing the "portable" parameter along, so that we are sure that only the generated images are encoded. But that would require quite a bit more knowledge from me about the structure of the code.

Let's have this as a conversation starter about what would be the best way to fix it.

Issues:

Do to these issues, please do not merge yet.

All relevant tests are passing (but unrelated tests for markdown and pdf seem to be not passing, but their code paths were not modified)


Some contributor instructions would be helpful.

vincentarelbundock commented 1 week ago

Oooh, this is amazing! Thanks for taking the time to put this together.

I'm having a crazy week at work so won't have to look at this right now. But at first glance, one thing I'll say is that I don't think we should add a new argument for this, since "portable" only applies to a single output format --- it is not a modifier for LaTeX, Typst, etc. So I would lean toward something like output="html_portable" or similar.

J-Moravec commented 1 week ago

Thanks for your kind words. I had the above solution so there was no reason why not to contribute.

That is a good idea about html_portable, and will make for cleaner logic.

If I am reading and understanding the code correctly, the plots are created only when an object is forced to evaluate, either by print method or by explicitely calling save_tt().

Alternatively, (and with html_portable no other arg would be required) one could include the logic of translating images into the base64 encoding directly in the plot_tt_lazy.

And yet another option would be to make the current brute-force approach just much more efficient, able to take any image, even user provided, and just force the generated images to be created in a tempdir() as to not have to delete them and potentially user-provided images, but pack every single image, even including those from the internet.

I kind of like the idea of fully portable html with all packed images, rather than semi portable one. I will look into it and modify the code accordingly.

vincentarelbundock commented 1 week ago

Cool, that sounds excellent. Yes, obviously not deleting the user's images is crucial. I don't have a strong view on the other options since I haven't looked at this carefully, but doing all this in plot_tt_lazy() feels write from a code organization perspective.

Note: If we're using an extra package to convert the encoding, I'd like that package to be optional, meaning in the Suggests section of DESCRIPTION, and calling assert_dependency("packagename") before the chunk where it is used.

J-Moravec commented 1 week ago

There is an issue with output = "html_portable" for save_tt is if the output is supposed to be a file, such as tt(...) |> ... |> save_tt(output = "file.html"). There is no way for us to know whether the user wanted portable or non-portable HTML. So we need some way for the user to specify this anyway.

Internally, we can use output = "html_portable", but I can't think of a way how to specify to output portable HTML as a file without using additional argument (or a system-wide option). Unless portable HTMLs are assumed by default (which IMO would be reasonable).

vincentarelbundock commented 1 week ago

Yes, I agree that portable by default feels fine, as long as we can draw tables with no image without installing the suggested dependency.

J-Moravec commented 1 week ago

This is an alternative implementation that results in much simpler code.

I wasn't able to figure out how to make html_portable work without making it default or changing a large amount of code. In the end, an additional parameter is still required.

But by modifying the tinytable class to remember the x@portable = TRUE, we don't really need to change much of the other code and encode the images directly in the plot_tt_lazy. We also generate the images in tempdir(), so nothing is being deleted (tempdir() would be deleted upon exiting the current R session anyway).

Currently, only the locally available images are encoded, not remote URLs, since I had problem with the extension-less URL in the examples, this would require guessing mimetype.

All images are assumed to be of mimetype image, not sure if it will work with some special cases like PDF.

vincentarelbundock commented 1 week ago

Would it be possible to do something like this at the very top of save_tt()?

if (isTRUE(output == "html_portable")) {
  output <- "html"
  x@portable <- TRUE
}

If I understand your logic, this would allow us to carry the portable info in the slot, while avoiding the unnecessary argument. Am I getting this right?

J-Moravec commented 1 week ago

Instead of:

  if(isTRUE(portable)){
    assert_dependency("base64enc")
    x@portable <- TRUE
  }

to remove the portable argument to save_tt()?

Or in addition to it so that html_portable can be directly specified?

If instead of, and thus removing the portable argument from save_tt(), how would one then specify a portable HTML with just a filepath? I guess tt() |> save_tt("html_portable") |> slot("table_string") |> write("test.html")) is not that difficult. But it is much gulier than tt() |> save_tt("test.html", portable = TRUE)

If I understand your logic, this would allow us to carry the portable info in the slot, while avoiding the unnecessary argument. Am I getting this right?

Yes

vincentarelbundock commented 1 week ago

I thought we were going for portable by default.

Is there a convincing use-case for not making HTML portable? I suppose one might want to keep the HTML itself smaller, but that seems like a niche application, which could potentially be dealt with using a global option rather than an extra argument.

My issue with adding a portable argument is that it "pollutes" the interface for something that is rarely used, and only applies to a single output format.

What do you think?

J-Moravec commented 1 week ago

Can do. The only arguments against it might be that:

  1. it is currently an experimental feature
  2. it introduces potentially breaking changes in behaviour.
  3. it introduces new dependency for HTML output.
  4. Remote images are not packed. (coz without extension, that would require reading the magic bits)

But if you are happy with this potentially breaking behaviour, I am happy to fix the PR so that we have html_portable and save_tt(output = "*.html") defaults to html_portable.

vincentarelbundock commented 1 week ago

Thanks, those are some good arguments against. How about we do it the other way?

save_tt(output = "html_portable") does what I suggested above.

save_tt(output = "file.html") checks getOption(tinytable_html_portable, default=FALSE)

vincentarelbundock commented 1 week ago

FYI, I updated this PR to main to avoid merge conflicts in the future.

J-Moravec commented 1 week ago

Test case was failing since the merge did change something with random string generation for the CSS formatting. Snapshot was updated and locally, the tests are passing.

vincentarelbundock commented 1 week ago

Merged.

Thanks a ton for this @J-Moravec . This is a super cool feature, and I'm very grateful for the code!