insightsengineering / rtables

Reporting tables with R
https://insightsengineering.github.io/rtables/

Labels and captions #168

Closed dmenne closed 2 years ago

dmenne commented 3 years ago

I write clinical reports, so rtables would be a great tool to use. Unfortunately, two absolutely required features are not available: labels (for cross-referencing) and captions.

I am curious to learn how you write reports at Roche without these. Do you have a workaround?

If you want to see what should work (kable is the yardstick):

https://gist.github.com/dmenne/f8eb291c9e71a5de44764d442e8bdefd

gmbecker commented 3 years ago

So there are a couple of things here. Firstly, the concept of tables having titles is on the roadmap and expected to land this year. Once tables have titles (and possibly also ids, which would be the concept of a 'label' here), the issue of referencing them from text is going to look different than it does now. cc @waddella

For now, we already have an as_html rendering function (though our primary target output format is currently ASCII). In the gabe_tabletree_work branch only, I have added caption_txt and link_label arguments to it. They are experimental and subject to change or removal (if/when we solve this a different way), but they will allow you to use as_html as a workaround to get the things you need in HTML output, including auto-numbering and working links.

Note that those arguments may go away again, or have defaults you never want to override, once rtables more formally models id (i.e. link label) and title (i.e. caption) for tables, as they will then be intrinsic parts of the table itself that will just be retrieved.

I don't have a date for when this will be in a version of rtables available on CRAN; it was more a proof of concept that we can achieve it at all.

If you need a workaround that works with the CRAN version, something like

tags <- htmltools::tags

# Wrap the rendered table in an outer HTML table whose caption carries
# the bookdown-style "(#tab:<id>)" label followed by the caption text.
caption_hack <- function(tt, cap_txt, lab_id) {
    innertab <- rtables:::as_html(tt)
    tags$table(
        tags$caption(class = "Table Caption",
                     sprintf("(#tab:%s) %s", lab_id, cap_txt)),
        tags$tr(tags$td(innertab))
    )
}

should pretty much do it for you in combination with results="asis" chunks, for HTML output formats, from what I can see from (admittedly quite limited) testing.
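For readers who want to try the workaround above, here is a minimal usage sketch. The iris-based table, the caption text and the id "sepal" are illustrative assumptions, not part of the original comment, and the block repeats the helper (calling the exported as_html) so that it runs on its own. Whether bookdown numbers and links the caption as expected should be checked against your own output format.

```r
library(rtables)

tags <- htmltools::tags

# Same workaround as in the comment above, repeated so this block is self-contained.
caption_hack <- function(tt, cap_txt, lab_id) {
  innertab <- as_html(tt)
  tags$table(
    tags$caption(class = "Table Caption",
                 sprintf("(#tab:%s) %s", lab_id, cap_txt)),
    tags$tr(tags$td(innertab))
  )
}

# Illustrative table: mean sepal length per species in the built-in iris data.
lyt <- analyze(split_cols_by(basic_table(), "Species"),
               "Sepal.Length", afun = mean, format = "xx.xx")
tbl <- build_table(lyt, iris)

# Inside an R Markdown chunk with results="asis", emit the raw HTML.
html <- as.character(caption_hack(tbl, "Mean sepal length by species", "sepal"))
cat(html)
```

In a bookdown document the table would then be referenced from the text as \@ref(tab:sepal), assuming bookdown picks up the "(#tab:sepal)" label in the caption.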

dmenne commented 3 years ago

Thanks for your comments. It's a bit difficult to understand why title is more important for you than caption, but requirements are different.

In my gist, I use a similar function for tables that have no caption, but your caption hack looks more elegant. The gist was updated and now references this thread.

dmenne commented 3 years ago

Is there any reason why you change the names from those of other packages? id = label, title = caption? This causes a lot of confusion.

gmbecker commented 3 years ago

First off, label means something entirely different in rtables, and I'm not convinced the general concept as we use it maps well to what the thing you are calling a label is doing. That is because it was never a good name for what that does in the first place, as what it actually acts as is a unique id that serves as an anchor for links. There's no reason to expect something called label to be unique or that it would be used as an index, as labels are for display.

As for captions, again, the purpose of the captions and the position bookdown puts them in mean that they are acting like titles, not like captions.

More generally, rtables builds tables that are more complicated than are possible in things like kable, so expecting everything to map just as it does in the much simpler, more constrained situation is not likely to be realistic.

dmenne commented 3 years ago

OK, after I tried it, I see that I was misled in thinking rtables was for markdown; I did not find an example that is analogous to kable etc.

Your ASCII tables are for "direct consumption" by users of SPSS etc., so they look familiar. Fair enough, but a different target group. I realized it when it created a new top-level chapter for me, which was triggered by the ----- line.

Thanks for your quick responses. I have removed the reference to your package from the gist.

waddella commented 3 years ago

@dmenne, thanks for your feedback and input.

rtables is still under active development and the main features missing for the submission space are around titles, footnotes, row gaps and other formatting features. There is an R consortium working group that I am part of that discusses the requirements for tables in Pharma. You can read more here:

https://github.com/RConsortium/rtrs-wg/blob/main/Requirements/working-requirements/requirements.md

The current rtables features revolve around tabulation, value access (see value_at) and ASCII output. We have yet to decide how tight the R Markdown integration should be. There is also gt, which offers a lot of formatting options. As a first step, we will probably put a gt coercion function on our roadmap.

dmenne commented 3 years ago

Thanks for your comments, and for the links to the working group. Yesterday I posted on the gt GitHub page on the subject of missing captions and labels for reference: planned for 2 years, but stuck.

Strangely, the pharma companies I work for (yours included) always end up requesting Word output because it is easier to comment on. PDF is an alternative, but mostly with formatting requirements from the last millennium (there must be a box around the items xxx when there is a box around...) that force me to use my old rusty LaTeX book. For one of the Swiss biggies, I needed 2/3 of the time for the formatting of the report, even in times when my LaTeX was not so bad.

dmenne commented 3 years ago

It is really strange that cross-referencing and table numbering are not on the working group list. I find it absolutely important to be able to write "As table xx on page yy shows", where both should have a crosslink. Tables (and figures) should be seen in a context, not as standalone creatures, and cross-linking helps to achieve this.

gmbecker commented 3 years ago

I think this is because it is not really a function of the tables, it is a function of the containing document (bookdown in the case that you are promoting). It really doesn't have anything much to do with how the tables are constructed, or what is in them, right?

For context, think about how plots (the actual visualization) correspond to figures (the referenced entities in an overarching document). The captions/linking/etc. don't have anything to do with the ggplot2 objects, or with base graphics' lack of even being an object, do they? I don't know of any core reason tables should be conceptualized differently than that. It is true that, the way bookdown is currently designed, they are somewhat different, but my commit yesterday showed that we can support that as needed.

In terms of markdown, pandoc markdown straight up does not support the types of multi-header-row, column-spanning header structures that are a core feature of rtables (unless you count just putting the full HTML table in there, which you can do but isn't what we're talking about here). So a renderer that outputs markdown for rtables-generated tables in general is literally impossible. There is actually some indication that pandoc markdown will be extended in the future to support this (updating their internal object model for tables to support column-spanning and these types of things literally only happened last year from what I can see). When they do, we will be able to write a markdown table renderer, but when that happens it will require users to have a cutting-edge version of pandoc.
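To make the point about header structure concrete, here is a small sketch of a table with a two-row, column-spanning header. The choice of mtcars and of the cyl and am variables is an illustrative assumption; any two nested column splits give the same shape.

```r
library(rtables)

# Illustrative data: cylinders (cyl) and transmission type (am) from mtcars.
dat <- mtcars
dat$cyl <- factor(dat$cyl)
dat$am <- factor(dat$am, labels = c("automatic", "manual"))

# Two nested column splits: each cyl level spans the am columns beneath it.
lyt <- analyze(
  split_cols_by(split_cols_by(basic_table(), "cyl"), "am"),
  "mpg", afun = mean, format = "xx.x"
)
tbl <- build_table(lyt, dat)

# The ASCII rendering shows the cyl header row spanning pairs of am columns.
print(tbl)
```

The top header row has three cells, each spanning two of the six leaf columns, which is the kind of structure the paragraph above says pandoc markdown tables could not represent.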

dmenne commented 3 years ago

Interesting thoughts on how you want to uncouple table structure and rendering. I do not agree with you on one point:

The captions/linking/etc. don't have anything to do with the ggplot2 objects,

On some level, they should. And for plots (not only ggplot), the container is the chunk, and you can set the figure caption in the chunk. flextable, for example, also uses the chunk concept for containers, not for captions but for reference labels, so there is some chaos. That's the reason why I maintain the cheatsheet https://gist.github.com/dmenne/f8eb291c9e71a5de44764d442e8bdefd for HTML and, further below, for docx, because I cannot remember who works with what.

We should be sure that references and captions always stay with the object; in the simplest case, they should stay on the same page (not a problem with HTML, which makes life easier). I remember how horribly badly this worked in early versions of Microsoft Word; it is better now. Typographically, some protection against Schusterjungen (nicer than orphans and widows).

Or, when one day you want to move a table with its caption into a two-column structure, the caption should stay with its table.

waddella commented 2 years ago

@dmenne I will close this issue. Please try the argument link_label in as_html. If that does not solve the issue then please reopen the issue and add an Rmd example of your desired output (using another framework).
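A minimal sketch of the suggested call, assuming an rtables version in which as_html accepts link_label and basic_table accepts title. The table contents and the id "sepal" are illustrative assumptions, and the exact HTML produced for the caption may differ between rtables versions, so inspect the output before relying on it for cross-references.

```r
library(rtables)

# Illustrative table with a title, which is meant to play the role of the caption.
lyt <- analyze(
  split_cols_by(basic_table(title = "Mean sepal length by species"), "Species"),
  "Sepal.Length", afun = mean, format = "xx.xx"
)
tbl <- build_table(lyt, iris)

# link_label is the anchor id without the "tab:" prefix.
html <- as.character(as_html(tbl, link_label = "sepal"))

# Inside an R Markdown chunk with results="asis", emit the raw HTML.
cat(html)
```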