Is there a way to "select" desired columns?

haozhu233 commented 8 years ago

I was trying to drop the "statistic" column partway through a pipe using the select function from dplyr. Here is what I got:

  dust(fit) %>%
  sprinkle(col = 2:4, round = 3) %>% 
  sprinkle(col = 5, fn = quote(pvalString(value))) %>% 
  select(-statistic) %>%
  sprinkle_colnames(term = "Term", 
                    estimate = "Estimate", 
                    std.error = "SE",
                    p.value = "P-value")

Error in UseMethod("select_") : 
  no applicable method for 'select_' applied to an object of class "dust"

Clearly, the reason is that the output of dust() is not a data.frame, so select() doesn't know how to handle an object of this class. I'm not exactly sure how pixiedust works internally. Would it be possible to add "data.frame" (or even "tbl_df" and "tbl") to the class attribute?

attributes(
     dust(fit) %>%
     sprinkle(col = 2:4, round = 3) %>% 
     sprinkle(col = 5, fn = quote(pvalString(value)))
     )

$names
[1] "head"            "body"            "interfoot"       "foot"            "border_collapse" "longtable"      
[7] "print_method"   

$class
[1] "dust"

I'm using the dev version of pixiedust.

ghost commented 8 years ago

Does removing the statistic column before you call dust() help? Turn the fitted model (fit) into a data.frame first with broom::tidy(), then drop the column from that.

haozhu233 commented 8 years ago

Cool, thanks! I guess I misunderstood the concept of this package. I'm going to close this issue now.

nutterb commented 8 years ago

I don't think you misunderstood the concept of the package at all. What you're asking for would be quite convenient. You have, however, bumped into a limitation that I'm not sure how I would handle. Consider the following example table where I've included the summary statistics in the table footer (looking at the HTML table will probably make the summary statistics a little more visible):

library(broom)
library(pixiedust)
library(dplyr)

fit <- lm(mpg ~ qsec + wt + factor(am), data = mtcars)

dust(fit, glance_foot = TRUE) %>%
  medley_model()

            term estimate std.error   statistic p.value
1    (Intercept)     9.62      6.96        1.38    0.18
2           qsec     1.23      0.29        4.25 < 0.001
3             wt    -3.92      0.71       -5.51 < 0.001
4    factor(am)1     2.94      1.41        2.08   0.047
5      r.squared     0.85                logLik  -72.06
6  adj.r.squared     0.83                   AIC  154.12
7          sigma     2.46                   BIC  161.45
8      statistic    52.75              deviance  169.29
9        p.value        0           df.residual      28
10            df        4

In the main body of the table (with the coefficients), it's pretty clear which column to remove. In the foot of the table (r-squared and other fit statistics), it isn't quite as clear which column needs to be removed. And if I were to just remove the one under statistic, I'd lose the descriptors of all those fit statistics on the right side.
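One way to see this is to look at the pieces of the dust object directly. This is only a rough sketch: it assumes the parts are stored under the names shown in the attributes() output earlier in this thread (head, body, interfoot, foot), and it uses str() rather than relying on any particular internal column names, since those may differ between versions.

```r
library(pixiedust)

fit <- lm(mpg ~ qsec + wt + factor(am), data = mtcars)

# Build the table with the glance() statistics in the foot.
x <- dust(fit, glance_foot = TRUE)

# The object is a list of separate parts, not one data frame.
names(x)

# Inspect the body and the foot separately. The foot is laid out
# independently of the body, so a body column has no obvious
# counterpart there.
str(x$body, max.level = 1)
str(x$foot, max.level = 1)
```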

While I would like to be able to remove a column with a sprinkle, I don't know that I can do so safely. So, as @helix123 suggests, I think your safest approach is:

tidy(fit) %>%
  select(-statistic) %>%
  dust() %>%
  sprinkle(...) .......
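
Spelled out against the original example, that approach might look like the sketch below. It is not a definitive recipe. The names coefs and tbl are only illustrative, and it assumes the full argument name is cols (the original post used col, which would rely on partial argument matching). The main thing to watch is that the column indices shift once statistic is gone.

```r
library(broom)
library(dplyr)
library(pixiedust)

fit <- lm(mpg ~ qsec + wt + factor(am), data = mtcars)

# Drop the column while the coefficients are still a plain data frame.
coefs <- tidy(fit) %>%
  select(-statistic)

# Remaining columns: term, estimate, std.error, p.value.
# The numeric columns are now 2:3 and the p-value is column 4, not 5.
tbl <- dust(coefs) %>%
  sprinkle(cols = 2:3, round = 3) %>%
  sprinkle(cols = 4, fn = quote(pvalString(value))) %>%
  sprinkle_colnames(term = "Term",
                    estimate = "Estimate",
                    std.error = "SE",
                    p.value = "P-value")

tbl
```

Because dust() is given a data frame here rather than the model itself, this version carries no model summary in the foot, which sidesteps the ambiguity described above.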

nutterb / pixiedust

Is there a way to "select" desired columns? #34