gesistsa / minty

🌿MINimal TYpe guesser
https://gesistsa.github.io/minty/

Make `minus` work for `type_convert()` #20

Closed chainsawriot closed 6 months ago

chainsawriot commented 6 months ago

https://github.com/tidyverse/readr/issues/1509

chainsawriot commented 6 months ago

For the moment, though, we should signal that it is not supported.

https://github.com/chainsawriot/minty/blob/912bb3609aab1b894cea97dcb9533d0753b5ad51/R/parser.R#L646
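One way to signal this could look like the sketch below. It is only an illustration in base R: `check_compact_col_types()` is a hypothetical helper, not a function in minty or readr, and the error wording is made up.

```r
# Hypothetical helper (not part of minty or readr): reject "-" in a compact
# col_types string with an explicit message instead of a cryptic failure.
check_compact_col_types <- function(col_types) {
  # Only a single compact string such as "?-" is inspected here;
  # lists and col_spec objects pass through untouched.
  if (is.character(col_types) && length(col_types) == 1L &&
      !is.na(col_types) && grepl("-", col_types, fixed = TRUE)) {
    stop(
      "Skipping columns with \"-\" in a compact `col_types` string is not supported; ",
      "use a named list such as list(weight = \"?\", group = \"-\") instead.",
      call. = FALSE
    )
  }
  invisible(col_types)
}

# A compact string without "-" passes through unchanged.
check_compact_col_types("??")
```

Calling `check_compact_col_types("?-")` would stop with the message above, which at least tells the user which form to use.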

chainsawriot commented 6 months ago

It errors only when the compact string shorthand (e.g. "?-") is used.

text_only <- as.data.frame(sapply(head(PlantGrowth), as.character))

readr::type_convert(text_only, col_types = list(weight = "?", group = "-"))
#>   weight
#> 1   4.17
#> 2   5.58
#> 3   5.18
#> 4   6.11
#> 5   4.50
#> 6   4.61

readr::type_convert(text_only, col_types = readr::cols(weight = readr::col_guess(), group = readr::col_skip()))
#>   weight
#> 1   4.17
#> 2   5.58
#> 3   5.18
#> 4   6.11
#> 5   4.50
#> 6   4.61

readr::type_convert(text_only, col_types = "?-")
#> Warning: Insufficient `col_types`. Guessing 1 columns.
#> Error in if (is.na(name)) {: argument is of length zero

minty::type_convert(text_only, col_types = list(weight = "?", group = "-"))
#>   weight
#> 1   4.17
#> 2   5.58
#> 3   5.18
#> 4   6.11
#> 5   4.50
#> 6   4.61

minty::type_convert(text_only, col_types = readr::cols(weight = readr::col_guess(), group = readr::col_skip()))
#>   weight
#> 1   4.17
#> 2   5.58
#> 3   5.18
#> 4   6.11
#> 5   4.50
#> 6   4.61

minty::type_convert(text_only, col_types = "?-")
#> Warning: Insufficient `col_types`. Guessing 1 columns.
#> Error in if (is.na(name)) {: argument is of length zero

Created on 2024-03-18 with reprex v2.1.0

chainsawriot commented 6 months ago

col_spec_standardise screams refactoring.

chainsawriot commented 6 months ago

This block of code doesn't look useful for minty: col_names are generated by type_convert, so there should be no cases with NA or duplicated column names.

https://github.com/chainsawriot/minty/blob/9470b74c6a7df9b7a61052a8e4a79c3d4ef38bd6/R/parser.R#L920-L952

chainsawriot commented 6 months ago

col_spec_standardise shouldn't check for missing (or additional) column names in our usage, because type_convert has already blocked those cases.

The check for missing column names is the reason "?-" does not work, even though "?-" is sufficient in the following case.

text_only <- as.data.frame(sapply(head(PlantGrowth), as.character))

## Won't work: already blocked by the length check
minty::type_convert(text_only, col_types = "?")
#> Error: `df` and `col_types` must have consistent lengths:
#>   * `df` has length 2
#>   * `col_types` has length 1
minty::type_convert(text_only, col_types = "?-?")
#> Error: `df` and `col_types` must have consistent lengths:
#>   * `df` has length 2
#>   * `col_types` has length 3

minty::type_convert(text_only, col_types = "?-")
#> Warning: Insufficient `col_types`. Guessing 1 columns.
#> Error in if (is.na(name)) {: argument is of length zero
minty::type_convert(text_only, col_types = "--")
#> Warning: Insufficient `col_types`. Guessing 2 columns.
#> Error in if (is.na(name)) {: argument is of length zero

## Also not working: unnamed list
minty::type_convert(text_only, col_types = list("?", "-"))
#> Warning: Insufficient `col_types`. Guessing 1 columns.
#> Error in if (is.na(name)) {: argument is of length zero

minty::type_convert(text_only, col_types = readr::cols(weight = readr::col_skip(), group = readr::col_skip(), aaa = readr::col_skip()))
#> data frame with 0 columns and 6 rows

Created on 2024-03-19 with reprex v2.1.0
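As a rough sketch of why the names are recoverable: once the length check has passed, each character of the compact string can be paired with the corresponding column name of the data frame. The helper below, `name_compact_spec()`, is hypothetical and written in base R only; it is not how minty or readr implement this.

```r
# Hypothetical helper for illustration; not minty's implementation.
name_compact_spec <- function(df, col_types) {
  # Split "?-" into c("?", "-"), one shorthand character per column.
  specs <- strsplit(col_types, "", fixed = TRUE)[[1]]
  # Mirror the length check that type_convert already performs.
  if (length(specs) != ncol(df)) {
    stop("`df` and `col_types` must have consistent lengths", call. = FALSE)
  }
  # Pair each shorthand character with the column name from the data frame.
  as.list(stats::setNames(specs, names(df)))
}

text_only <- as.data.frame(sapply(head(PlantGrowth), as.character))
spec <- name_compact_spec(text_only, "?-")
str(spec)
```

The result has the same shape as `list(weight = "?", group = "-")`, the named form that already works in the reprex above, so it could be passed on as `col_types` without needing any check for missing names.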