Closed antondutoit closed 4 years ago
I think this is tied to your data. You have a column that has the type Period and inherits from numeric. skim() is dispatching numeric summary functions, but the values returned by those functions aren't numeric.
Here's how I can reproduce your error:
library(skimr)
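# A mean() method for the "period" class that returns a classed,
# non-plain-numeric result.
mean.period <- function(x, ...) {
  res <- NextMethod("mean", x, ...)
  structure(res, class = c("Period", "numeric"))
}
# One plain numeric column and one column that also carries the "period" class.
my_df <- data.frame(
  numeric = 1:3,
  period = structure(1:3, class = c("period", "numeric"))
)
skim(my_df)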
#> Error: No common type for `..1$by_variable$numeric.mean` <double>
#> and `..2$by_variable$numeric.mean` <Period>.
You can explore the types of the columns in your data by calling str() on it.
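As a rough sketch of that check, here is one way to list column classes in base R and flag numeric columns that carry an extra class. The data frame is the toy one from the reproduction, not your real data, and has_extra_class is just a name I picked for illustration.

```r
# Toy stand-in for the real data (an assumption): a plain column plus a
# column that carries an extra "period" class on top of numeric.
my_df <- data.frame(numeric = 1:3)
my_df$period <- structure(1:3, class = c("period", "numeric"))

# Full class vector of every column.
col_classes <- lapply(my_df, class)
print(col_classes)

# Numeric columns that also have a class attribute are the ones likely
# to trip up the numeric skimmers.
has_extra_class <- vapply(
  my_df,
  function(col) is.numeric(col) && is.object(col),
  logical(1)
)
flagged <- names(my_df)[has_extra_class]
print(flagged)
```

Any column flagged this way is worth a closer look with str() or class().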
To fix this, you need to let skimr know that a different class exists within your data frame. This approach treats your period column as numeric, reusing the default skimming functions. You might want to call ?skim_with or skimr::stats for more ideas on how to summarize this data.
my_skim <- skim_with(
  period = modify_default_skimmers("numeric", new_skim_type = "period")
)
my_skim(my_df)
── Data Summary ────────────────────────
                           Values
Name                       my_df
Number of rows             3
Number of columns          2
_______________________
Column type frequency:
  numeric                  1
  period                   1
________________________
Group variables            None

── Variable type: numeric ──────────────────────────────────────────────────────
  skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75
1 numeric               0             1     2     1     1   1.5     2   2.5
   p100 hist
1     3 ▇▁▇▁▇

── Variable type: period ───────────────────────────────────────────────────────
  skim_variable n_missing complete_rate  mean    sd    p0   p25   p50   p75
1 period                0             1     2     1     1   1.5     2   2.5
   p100 hist
1     3 ▇▁▇▁▇
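If you don't need the column reported as its own type, another option might be to strip the extra class before skimming. This isn't something skimr provides. It's a base R sketch on the toy data, and strip_extra_class is a hypothetical helper name. A real Period column (for example, from a date-time package) would probably need that package's own conversion rather than unclass().

```r
# Toy stand-in for the real data (an assumption).
my_df <- data.frame(numeric = 1:3)
my_df$period <- structure(1:3, class = c("period", "numeric"))

# Hypothetical helper: convert classed numeric columns to plain doubles.
strip_extra_class <- function(df) {
  needs_strip <- vapply(
    df,
    function(col) is.numeric(col) && is.object(col),
    logical(1)
  )
  df[needs_strip] <- lapply(
    df[needs_strip],
    function(col) as.numeric(unclass(col))
  )
  df
}

plain_df <- strip_extra_class(my_df)
print(vapply(plain_df, class, character(1)))
```

The resulting data frame holds only base numeric columns, so passing it to skim() should use the default numeric skimmers throughout.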
I'm getting an error when running skimr, as follows:
Error: No common type for `..1$by_variable$numeric.p0` <double> and
`..18$by_variable$numeric.p0` <Period>.
I'm guessing this has to do with data types, but I am an R user rather than a programmer, so I don't know how to fix it. My code previously ran fine on a superset of the data that produces this error, so its appearance now is curious. I had added three new variables to the data frame, but using dplyr::select to remove them from the input to skimr had no effect. Neither did removing the one integer variable in the dataset (the rest of the numerics being doubles, obviously).
NB I have updated every package on my R install.
I can't attach original data due to confidentiality (ethics protocol), but if it's required for a solution I will see if I can generate some synthetic data which reproduces the error.
Any help would be much appreciated. Thanks.
last_error and last_trace output below: