DavZim / dataverifyr

A Lightweight, Flexible, and Fast Data Validation Package that Can Handle All Sizes of Data
https://davzim.github.io/dataverifyr/

Unknown class of x found when passing a tibble #7

Closed FedericoComoglio closed 1 year ago

FedericoComoglio commented 1 year ago

Hi @DavZim,

I've been testing this package and ran into an issue when passing a simple tibble to check_data(), based on one of the examples. For instance,

library(dataverifyr)
library(dplyr)

rules <- ruleset(
  rule(x > 0)
)

data <- tibble(x = 1:10)
check_data(data, rules)

throws

Error in detect_type(class(x)) : 
    Unknown class of x found: 'tbl_df,  tbl,  data.frame'. x must be a data.frame/tibble/data.table or a tbl (SQL table) or ArrowObject.

As expected, data is an object of class

> class(data)
[1] "tbl_df"     "tbl"        "data.frame"

The issue is reproducible with the dev version of the package (0.1.6.9001), tested on R 4.1.1 with dplyr 1.1.3.

From what I can tell, the object class is validated in dataverifyr:::detect_type. In this chunk

    else if ("tibble" %in% cc) {
        if (!has_pkg("dplyr")) 
            stop("The dplyr package needs to be installed in order to test a tibble OR you can convert the data to a data.frame first!")
        type <- "dplyr"
    }

you are looking for "tibble", but there is no handling of "tbl_df" and/or "tbl". Since class(data) never contains the string "tibble", this branch is never reached for a tibble. Since tibbles are such common objects, I was surprised this has not been reported before. Thanks a lot for your help.
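To illustrate the point, here is a minimal sketch of a class check that matches on "tbl_df" instead of "tibble". It is not the package's actual code: the function name detect_type_sketch and the backend labels "data.table" and "base-r" are assumptions made for illustration; only "dplyr" appears in the chunk quoted above.

```r
# Hypothetical helper, NOT dataverifyr's implementation.
# cc is the class vector of the data object, e.g. class(data).
detect_type_sketch <- function(cc) {
  if ("data.table" %in% cc) {
    # data.table objects have class c("data.table", "data.frame")
    "data.table"
  } else if ("tbl_df" %in% cc) {
    # tibbles have class c("tbl_df", "tbl", "data.frame");
    # the string "tibble" never appears in the class vector
    "dplyr"
  } else if ("data.frame" %in% cc) {
    # plain data.frame (label is an assumption)
    "base-r"
  } else {
    stop("Unknown class of x found: '", paste(cc, collapse = ", "), "'")
  }
}

# Class vector of a tibble, as shown in the report above
detect_type_sketch(c("tbl_df", "tbl", "data.frame"))
```

The "tbl_df" check comes before the "data.frame" check because a tibble also inherits from data.frame; testing in the other order would route tibbles to the plain data.frame branch.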

DavZim commented 1 year ago

Thanks for pointing this out. This is indeed a bug, and the latest commit should fix it. Please give it a try by installing the latest GitHub version and let me know if it doesn't work.

I have also exposed the function detect_backend() so that users can see which backend the package will use, and I have added tests for these cases.
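For anyone retrying the report, a sketch of how the fix could be checked, assuming the remotes package is available and that detect_backend() accepts the data object as its argument (the thread does not show its signature):

```r
# Install the development version from GitHub (requires the remotes package).
# install.packages("remotes")
remotes::install_github("DavZim/dataverifyr")

library(dataverifyr)
library(dplyr)

rules <- ruleset(
  rule(x > 0)
)

data <- tibble(x = 1:10)

# Assumption: detect_backend() takes the data object as its argument.
detect_backend(data)

# With the fix installed, this should no longer raise the
# "Unknown class of x found" error from the original report.
check_data(data, rules)
```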

FedericoComoglio commented 1 year ago

Thank you, @DavZim. I tested the hotfix by installing the latest devel version and can confirm it works. While testing, I made a couple of additional observations that could be useful; I'll share them in the coming days.