This PR adds support for collector-level na args (#532). This way, different lists of missing values can be specified for each column, overriding the global na arg in the call to vroom().
Example:
vroom(
  I("a,b,c\na,foo,REFUSED\nb,REFUSED,MISSING\nOMITTED,bar,OMITTED\n"),
  col_types = cols(
    a = col_character(na = "OMITTED"),
    b = col_character(na = "REFUSED"),
    c = col_character()
  ),
  na = "MISSING"
)
#> # A tibble: 3 × 3
#>   a     b     c
#>   <chr> <chr> <chr>
#> 1 a     foo   REFUSED
#> 2 b     NA    NA
#> 3 NA    bar   OMITTED
Without this PR, it is very difficult to efficiently read columns with different lists of missing values. Instead, they have to be loaded as character vectors, then parsed with readr::parse_*() or readr::type_convert(). There are two problems with this:
parsing a chr vector after loading with vroom forces the vector to materialize, defeating vroom's lazy-loading altrep goodness
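For comparison, here is a rough sketch of the current workaround on the same data as the example above. It is not part of this PR; it assumes readr is installed alongside vroom, and the variable name raw is only illustrative.

```r
# Sketch of the workaround without collector-level na (assumes vroom and readr are installed).
# Step 1: read every column as character and disable global NA handling.
raw <- vroom::vroom(
  I("a,b,c\na,foo,REFUSED\nb,REFUSED,MISSING\nOMITTED,bar,OMITTED\n"),
  delim = ",",
  col_types = vroom::cols(.default = vroom::col_character()),
  na = character()
)

# Step 2: re-parse each column with its own list of missing values.
# Each call reads the whole column, so the lazily loaded vectors are materialized here.
raw$a <- readr::parse_character(raw$a, na = "OMITTED")
raw$b <- readr::parse_character(raw$b, na = "REFUSED")
raw$c <- readr::parse_character(raw$c, na = "MISSING")
```

With collector-level na, the second step disappears and the columns can stay lazy until they are actually used.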
I'm hoping you'll consider this PR for inclusion in vroom. It only requires a few changes, is 100% backwards compatible, and adds a feature that cannot otherwise be implemented in a separate package (without duplicating all of vroom's internals). Please let me know if there is anything more I can do to advocate for it. Thank you for your consideration!
Note that this is failing the check for windows-latest (3.6) because the runner is grabbing the latest version of evaluate, which now requires R >= 4.0.0.