Add a View Data tab into the Insert Dialog

rdstern commented 9 months ago

@Patowhiz there was a good discussion in #8774, which is now closed. Patrick said the following:

"Upon further reflection on your suggestion, I'm contemplating the possibility of introducing a 'View Data' tab. This tab could generate scripts for viewing data frames, columns, and output objects without necessitating their addition to the data book. The generated scripts would invoke the appropriate output system corresponding to the selected option. This approach might provide a more cohesive solution to this and other current challenges. If you agree, I'll recommend this option."

I suggest this would be an excellent addition. I am already making a distinction between running "in" R-Instat and running "with" R-Instat.

So: "If you are "in", then your data is in an R-Instat data book. Then you have the spreadsheet view of the data and the metadata. The powerful calculation system is also available for data summaries, etc. (We plan for R packages, so these could also be used in RStudio, but that is not there yet, except for a package on the extensions we have made for missing values when summarising data.)"

If you are "with", then you are limited to the log/script and the output windows. You probably work more like this:

Rather than:

If you use "with" a lot, then you will find R-Instat inconvenient and limited compared to RStudio. Just one of many examples is the type-ahead feature in RStudio. More generally, if you are writing R scripts, then please use RStudio.

But if your use of scripts is often linked to the ordinary use of R-Instat and you occasionally wish to use scripts "with", then it is easy to do. Here is where we explain!

That's my lead-in to your new tab. Is that work for you, or can you specify it so others can do it? If it is going to take a long time, then it is sufficiently important (but not urgent) that I would propose we add the tab now, with text in the tab explaining that this is coming soon!

Patowhiz commented 9 months ago

@Vitalis95 will you be able to work on this?

Vitalis95 commented 8 months ago

@rdstern, what exact command(s) are to be added on this new tab? We can have a call.

rdstern commented 8 months ago

@lloyddewit and @Patowhiz and @Vitalis95 I include you all here to report on a discovery that, at least to me, seems important.
a) I suggest the new tab be called simply View, rather than View Data, because I am particularly interested also in viewing graphs.
b) Now my discovery, which you probably all knew about, but it changes a lot for me. When running a script, the graphs automatically display just in the R viewer. And the R viewer can only display a single graph. So, currently, if we have multiple graphs in a script, then we need to run it in sections and save each graph, because it is overwritten by the next one.

My discovery is that this isn't true! And that changes quite a lot for me.
The 2 useful commands are windows() and (possibly) dev.cur(), and maybe other dev commands. Once I run windows() it opens the graph viewer. And in the History menu I can click to turn on recording. Then it records multiple graphs.

If I type windows(record = TRUE) in the script window, then it turns recording on automatically. Then I can press Run All and look back at all the multiple graphs I generated.

I hope you agree that this changes quite a lot in our scope for running scripts. I like the R graph viewer, which allows the graph window to be resized and also gives various options for saving the graph, etc. The big downside interactively was that it crashes R-Instat when using the mouse wheel while it is drawing a graph. But that's much less of a problem if the graphs are coming from a script. (It might still be a problem when they are resized?)

So I assume that windows(), with the default being windows(record = TRUE), is a strong candidate for the View tab.
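For reference, here is a minimal sketch of what such a script might look like. It assumes the `record` argument as spelled in the grDevices documentation for windows(), and since windows() only exists on Windows, it falls back to dev.new() elsewhere (where there is no plot history). The two example plots use the built-in cars data and are only illustrative.

```r
# Open a separate graph window. On Windows, record = TRUE switches on
# plot recording, so earlier graphs can be revisited from the History menu.
if (.Platform$OS.type == "windows") {
  windows(record = TRUE)
} else {
  dev.new()  # fallback on other platforms; no plot history here
}

# Two graphs in a row; with recording on, the first is kept in the history.
plot(cars, main = "First graph")
hist(cars$speed, main = "Second graph")

# dev.cur() reports the active device, dev.list() all open devices.
dev.cur()
dev.list()
```

With this at the top of a script, Run All should leave every graph available in the window's history rather than only the last one.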

Maybe there is a small block with the label View?
1) O Graph Window <checkbox, default is checked> Multiple Graphs
2) O Data Window

I would still also like the commands to give the option to put the graphs into the Output window. @Patowhiz can you specify those.

What else is useful here?

Patowhiz commented 8 months ago

@rdstern for output window, the options are already specified in the Get Data tab.

rdstern commented 8 months ago

@Vitalis95 the comment above by @Patowhiz is for something different. But for now the only urgent addition is the one described below. I have also moved the following to a new issue, #8844, because my next comment (below) is on the initial topic here of adding a View tab.

O Graph Window <checkbox, default is checked> Multiple Graphs
This has the comment line "Open the R graph viewer". If the checkbox is checked, then add "recording multiple graphs" to the comment. The default is windows(). With the checkbox checked it adds record = TRUE, giving windows(record = TRUE).
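To make the specification concrete, here is a small sketch of the script text the control could produce. The function name `graph_window_script` and its argument `multiple_graphs` are hypothetical, purely for illustration, and are not part of R-Instat.

```r
# Hypothetical helper: builds the comment line and command that the
# "Graph Window" option would insert into the script window.
graph_window_script <- function(multiple_graphs = TRUE) {
  comment <- "# Open the R graph viewer"
  command <- "windows()"
  if (multiple_graphs) {
    # Checkbox checked: extend the comment and turn on recording.
    comment <- paste0(comment, ", recording multiple graphs")
    command <- "windows(record = TRUE)"
  }
  paste(comment, command, sep = "\n")
}

# Checkbox checked (the default) and unchecked:
cat(graph_window_script(TRUE), "\n")
cat(graph_window_script(FALSE), "\n")
```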

For now I suggest it just be added to the Commands tab. We'll consider adding a new View tab later. I'll explain on Skype.

And a small change at the same time. Could you change the name Examples on that tab to Library?

And another small change, at the same time, please @Vitalis95. Could you please add the Help ID of 180 to the Insert dialog?

rdstern commented 8 months ago

@Patowhiz we must be thinking differently then about the possible new View tab. You said "for output window, the options are already specified in the Get Data tab."

The Get Data tab is designed to get data from an R-Instat data book. I am thinking of scripts that might not even use a data book at all. The text results from these scripts go nicely into the output window.

But html output goes just to a browser and graphs go only to the R viewer.

Below I give examples of html output in a browser. I got a bit carried away, because they are severe tests of the scripts, which have come through with flying colours!

Then most of the scripts with packages produce multiple graphs. Could we also add the commands to include them in the ordinary output window?

Here is an example from the gt package - nanoplots - which I like (and hope can soon come from your table options sub-dialog?)

# Load library "gt"
library(package="gt")
md <- gt::md

illness |>
  dplyr::slice_head(n = 10) |>
  gt(rowname_col = "test") |>
  tab_header("Partial summary of daily tests performed on YF patient") |>
  tab_stubhead(label = md("**Test**")) |>
  cols_hide(columns = c(starts_with("norm"), starts_with("day"))) |>
  fmt_units(columns = units) |>
  cols_nanoplot(
    columns = starts_with("day"),
    new_col_name = "nanoplots",
    new_col_label = md("*Progression*")
  ) |>
  cols_align(align = "center", columns = nanoplots) |>
  cols_merge(columns = c(test, units), pattern = "{1} ({2})") |>
  tab_footnote(
    footnote = "Measurements from Day 3 through to Day 8.",
    locations = cells_column_labels(columns = nanoplots)
  )

While here I'm testing the rest. Here is the second:

pizzaplace |>
  dplyr::select(type, date) |>
  dplyr::group_by(date, type) |>
  dplyr::summarize(sold = dplyr::n(), .groups = "drop") |>
  tidyr::pivot_wider(names_from = type, values_from = sold) |>
  dplyr::slice_head(n = 10) |>
  gt(rowname_col = "date") |>
  tab_header(
    title = md("First Ten Days of Pizza Sales in 2015")
  ) |>
  cols_nanoplot(
    columns = c(chicken, classic, supreme, veggie),
    plot_type = "bar",
    new_col_name = "pizzas_sold",
    new_col_label = "Sales by Type",
    options = nanoplot_options(
      show_data_line = FALSE,
      show_data_area = FALSE,
      data_bar_stroke_color = "transparent",
      data_bar_fill_color = c("brown", "gold", "purple", "green")
    )
  ) |>
  cols_width(pizzas_sold ~ px(150)) |>
  cols_align(columns = -date, align = "center") |>
  fmt_date(columns = date, date_style = "yMMMEd") |>
  opt_all_caps()

Giving:

Then:

towny |>
  dplyr::select(name, starts_with("population"), starts_with("density")) |>
  dplyr::filter(population_2021 > 200000) |>
  dplyr::arrange(desc(population_2021)) |>
  gt() |>
  fmt_integer(columns = starts_with("population")) |>
  fmt_number(columns = starts_with("density"), decimals = 1) |>
  cols_nanoplot(
    columns = starts_with("population"),
    reference_line = "median",
    new_col_name = "population_plot",
    new_col_label = md("*Change*")
  ) |>
  cols_nanoplot(
    columns = starts_with("density"),
    plot_type = "bar",
    new_col_name = "density_plot",
    new_col_label = md("*Change*")
  ) |>
  cols_hide(columns = matches("2001|2006|2011|2016")) |>
  tab_spanner(
    label = "Population",
    columns = starts_with("population")
  ) |>
  tab_spanner(
    label = "Density ({{*persons* km^-2}})",
    columns = starts_with("density")
  ) |>
  cols_label_with(
    columns = -matches("plot"),
    fn = function(x) gsub("\\D+", "", x)
  ) |>
  cols_align(align = "center", columns = matches("plot")) |>
  cols_width(
    name ~ px(140),
    everything() ~ px(100)
  ) |>
  opt_horizontal_padding(scale = 2)

Gives:

And:

sza |>
  dplyr::filter(latitude == 20 & tst <= "1200") |>
  dplyr::select(-latitude) |>
  dplyr::filter(!is.na(sza)) |>
  dplyr::mutate(saa = 90 - sza) |>
  dplyr::select(-sza) |>
  tidyr::pivot_wider(
    names_from = tst,
    values_from = saa,
    names_sort = TRUE
  ) |>
  gt(rowname_col = "month") |>
  tab_header(
    title = "Solar Altitude Angles",
    subtitle = "Average values every half hour from 05:30 to 12:00"
  ) |>
  cols_nanoplot(
    columns = matches("0"),
    plot_type = "bar",
    missing_vals = "zero",
    new_col_name = "saa",
    plot_height = "2.5em",
    options = nanoplot_options(
      data_bar_stroke_color = "GoldenRod",
      data_bar_fill_color = "DarkOrange"
    )
  ) |>
  cols_hide(columns = matches("0")) |>
  tab_options(
    table.width = px(400),
    column_labels.hidden = TRUE
  ) |>
  cols_align(
    align = "center",
    columns = everything()
  ) |>
  tab_source_note(
    source_note = "The solar altitude angle is the complement to
    the solar zenith angle. TMYK."
  )

Gives:

Finally:

pizzaplace |>
  dplyr::filter(date == "2015-01-01") |>
  dplyr::mutate(date_time = paste(date, time)) |>
  dplyr::select(type, date_time, price) |>
  dplyr::group_by(type) |>
  dplyr::summarize(
    date_time = paste(date_time, collapse = ","),
    sold = paste(price, collapse = ",")
  ) |>
  gt(rowname_col = "type") |>
  tab_header(
    title = md("Pizzas sold on **January 1, 2015**"),
    subtitle = "Between the opening hours of 11:30 to 22:30"
  ) |>
  cols_hide(columns = c(date_time, sold)) |>
  cols_nanoplot(
    columns = sold,
    columns_x_vals = date_time,
    expand_x = c("2015-01-01 11:30", "2015-01-01 22:30"),
    reference_line = "median",
    new_col_name = "pizzas_sold",
    new_col_label = "Pizzas Sold",
    options = nanoplot_options(
      show_data_line = FALSE,
      show_data_area = FALSE,
      currency = "USD"
    )
  ) |>
  cols_width(pizzas_sold ~ px(200)) |>
  cols_align(columns = pizzas_sold, align = "center") |>
  opt_all_caps()

Gives:

Patowhiz commented 8 months ago

@rdstern the get data object function uses a standalone view data object function under the hood. That's why I told @Vitalis95 to check it. The view function doesn't require the data book and can therefore be used to render the objects in the data book directly.

To use the view data function for viewing in the output window, you have to give it the correct parameters. It's not designed to guess how to render the passed object without being told things like the format. RStudio somehow does this internally. My previous research shows that their environment exposes a 'viewer', which I think R developers use (I'm not entirely sure). They probably have a tech stack that allows them to do this, whereas ours somehow limits us. I intended to do more research on this at some point.
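As a pointer for that research, here is a minimal sketch of the pattern that packages commonly use for the RStudio viewer, as far as I understand it: RStudio sets the R option "viewer" to a function, and packages call it when present, falling back to a browser otherwise. The helper name `show_html` is hypothetical, and nothing here is R-Instat code.

```r
# Hypothetical helper: show an HTML file in the host's viewer if one is
# registered via the "viewer" option, otherwise fall back to a browser.
show_html <- function(path) {
  viewer <- getOption("viewer")
  if (is.function(viewer)) {
    viewer(path)            # e.g. the RStudio Viewer pane
    "viewer"
  } else if (interactive()) {
    utils::browseURL(path)  # default browser
    "browser"
  } else {
    "none"                  # non-interactive session: nothing to show
  }
}

# A throwaway page in the session temp directory.
html_file <- tempfile(fileext = ".html")
writeLines("<h1>Hello from a script</h1>", html_file)
shown_in <- show_html(html_file)
```

If R-Instat registered its own function under a similar option, HTML output might be routed to the output window in the same way, though that is an assumption to be checked.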

IDEMSInternational / R-Instat

Add a View Data tab into the Insert Dialog #8798