ropensci / skimr

A frictionless, pipeable approach to dealing with summary statistics
https://docs.ropensci.org/skimr

Html output in the R viewer (i.e. Better integration with the gt package) #667

Open michaelquinn32 opened 3 years ago

michaelquinn32 commented 3 years ago

The gtsummary package shows some great ideas for producing nice tables with gt. http://www.danieldsjoberg.com/gtsummary/

It seems like skimr should also be able to support gt methods that produce nice html tables. This should be similar to our support of other approaches for doc-ready tables.

elinw commented 2 years ago

Yes, totally agree with this. gt is a really nice model.

michaelquinn32 commented 2 years ago

We're pretty close already with this sort of code snippet:

library(skimr)  # skim() and partition(); skimr also re-exports %>%

skim(iris) %>%
  partition() %>%
  purrr::map(gt::gt) %>%
  htmltools::tagList() %>%
  htmltools::browsable()

Four wants:

  1. Make that output above nicer, incorporating the skim summary, titles for each type and appropriate spacing
  2. Decide if an appropriate "viewer" is available to use (likely from getOption("viewer")) for html output
  3. Dispatch the appropriate histogram function based on viewer availability. kableExtra can create nice html-friendly sparkgraphs: https://cran.r-project.org/web/packages/kableExtra/vignettes/awesome_table_in_html.html#Insert_Images_into_Columns
  4. Make this the default output when html formatting is available in a notebook: https://colab.research.google.com/drive/1KgT0LGBB7r_qZ3nR5hPWTq8nvSAwzs6P?usp=sharing
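
Wants 2 and 3 could be sketched roughly as below, using base R only. The helper names `has_html_viewer()` and `choose_hist_style()` are hypothetical and not part of skimr, and the check assumes that a front end with a viewer pane sets the `viewer` option to a function (as RStudio does) while it is `NULL` elsewhere.

```r
# Hypothetical helper: is an HTML viewer available in this session?
# Assumption: front ends with a viewer pane set options(viewer = <function>).
has_html_viewer <- function() {
  is.function(getOption("viewer"))
}

# Hypothetical dispatch: pick a histogram style from viewer availability.
# "html" stands in for an image-based sparkline, "text" for a character histogram.
choose_hist_style <- function(viewer_available = has_html_viewer()) {
  if (viewer_available) "html" else "text"
}
```

The returned label would then select the actual histogram function, so the rest of the printing code does not need to know how the viewer was detected.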

michaelquinn32 commented 2 years ago

More iteration on this idea, but I think I've got it now.

I think this is really getting us towards skimr 3.0 territory. It's really exciting.

We need to create a new option: whether or not someone wants to use rich html output (and make sure this aligns with jupyter.rich_display). I looked a bit more, and step 2 above really isn't necessary. We can use the browsable() function from htmltools to handle viewing. It worked for me in RStudio and VS Code. We only need to turn browsable() on if interactive().
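
A minimal sketch of how that option and the interactive() check might fit together. The option name `skimr_rich_html` and both helper names are hypothetical; falling back to `jupyter.rich_display` when the skimr option is unset is one possible reading of "aligns with", not a settled design.

```r
# Hypothetical option lookup: "skimr_rich_html" is not defined by skimr today.
# Assumption: when it is unset, follow jupyter.rich_display, else default to FALSE.
use_rich_html <- function() {
  fallback <- getOption("jupyter.rich_display", default = FALSE)
  isTRUE(getOption("skimr_rich_html", default = fallback))
}

# Mark HTML tags as browsable only in interactive sessions,
# so non-interactive runs (e.g. scripts) get the plain tag object back.
show_html <- function(tags, is_interactive = interactive()) {
  if (is_interactive) htmltools::browsable(tags) else tags
}
```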

Next we need to introduce a new concept to skimr: render_* functions. As of now, I see render_text (replacing the default print options), render_markdown (replacing the knit print functions) and the new render_html functions. All take a skim_df and two functions: one for rendering the summary and the other for rendering the individual statistics tables. With these, the user will get much more granular options for customizing output. I think this is really important for the html varieties, since there are so many options within gt.

These functions will be generic. You should be able to call them on a complete skim_df, a skim_list (what you get from partition) and a one_skim_df (a single element of what partition returns).
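
To make the generic idea concrete, here is a rough S3 sketch for the html case. `render_html()` and its arguments `summary_fn` and `table_fn` are hypothetical names for the proposal in this comment; nothing like this exists in skimr yet, and the return shape (a plain list) is only an assumption.

```r
# Hypothetical generic: dispatches on skim_df, skim_list or one_skim_df.
render_html <- function(x, summary_fn, table_fn, ...) {
  UseMethod("render_html")
}

# A single per-type table: hand it to the table renderer (e.g. gt::gt).
render_html.one_skim_df <- function(x, summary_fn, table_fn, ...) {
  table_fn(x, ...)
}

# The result of partition(): render each per-type table in turn.
render_html.skim_list <- function(x, summary_fn, table_fn, ...) {
  lapply(x, render_html, summary_fn = summary_fn, table_fn = table_fn, ...)
}

# A complete skim_df: render the summary, then the partitioned tables.
render_html.skim_df <- function(x, summary_fn, table_fn, ...) {
  list(
    summary = summary_fn(summary(x)),
    tables = render_html(skimr::partition(x), summary_fn, table_fn, ...)
  )
}
```

A call might then look like `render_html(skim(iris), summary_fn = gt::gt, table_fn = gt::gt)`, with users swapping in their own functions to customize either part.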

Otherwise, the last bit of this will be to clean up the summary. Right now, the summary generates the object and does some of the printing in the same step. We'll need to handle that in separate functions, so that we'll have more options for summary printing.
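
As an illustration of that split, the sketch below builds a summary object in one function and formats it in another. It works on a plain data frame in base R, and `build_summary()`, `format_summary_text()` and the class name `skim_summary_sketch` are all made up for the example; skimr's real summary object carries more fields.

```r
# Step 1 (hypothetical): build a plain summary object, with no printing.
build_summary <- function(data) {
  types <- vapply(data, function(col) class(col)[1], character(1))
  structure(
    list(n_rows = nrow(data), n_cols = ncol(data), type_counts = table(types)),
    class = "skim_summary_sketch"
  )
}

# Step 2 (hypothetical): one of several possible renderers for that object.
format_summary_text <- function(s) {
  c(
    paste("Number of rows:", s$n_rows),
    paste("Number of columns:", s$n_cols),
    paste0(names(s$type_counts), ": ", as.vector(s$type_counts))
  )
}
```

With the object built separately, a text, markdown or html renderer can each consume the same summary without recomputing it.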

michaelquinn32 commented 2 years ago

Another extension here is to produce the results as an html widget. Then, for example, we could put each table (summary + stats for each type) within a tabset. https://dev.to/imiahazel/pure-html-css-tabs-2p60

Supporting that means our option for output type shouldn't just be boolean, but support multiple text values. Closer to an enum.
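
One way to get enum-like behavior in R is `match.arg()` against a fixed set of values. In the sketch below the option name `skimr_output_type` and the four values are assumptions for illustration; the thread has not settled on either.

```r
# Hypothetical set of allowed output types; not defined by skimr.
output_types <- c("text", "markdown", "html", "widget")

# Read the (hypothetical) option and validate it against the allowed values.
# match.arg() raises an error for anything outside output_types.
resolve_output_type <- function(type = getOption("skimr_output_type", "text")) {
  match.arg(type, output_types)
}
```

Note that `match.arg()` also accepts unambiguous prefixes, so a stricter check (for example `type %in% output_types`) may be preferable if partial matching is unwanted.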

elinw commented 2 years ago

This sounds amazing. Maybe we should be declaring a skimr 3 roadmap?

Elin (by email, in reply to Michael Quinn's comment of Dec 30, 2021)

olivroy commented 1 year ago

@michaelquinn32

I have a suggestion for 1.

gt_captioned <- function(x, name) gt::gt(x, caption = name)

skim(iris) %>%
   partition() %>%
   purrr::imap(gt_captioned) %>%
   htmltools::tagList() %>%
   htmltools::browsable()

I'd be interested to see more!

On my side, I made further modifications to improve the display, such as percent formatting and auto formatting on the gt side. That is out of scope for skimr, though.

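# Render one partitioned skim table with gt, using the type as the caption
gt_skim <- function(data, type) {
  gt::gt(data, caption = type) |>
    gt::fmt_auto(lg_num_pref = "suf") |>
    gt::fmt_percent(
      # any_of() skips the column silently if a table lacks it
      columns = tidyselect::any_of("complete_rate"),
      drop_trailing_zeros = TRUE,
      decimals = 1
    )
}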