vincentarelbundock / countrycode

R package: Convert country names and country codes. Assigns region descriptors.
https://vincentarelbundock.github.io/countrycode
GNU General Public License v3.0
346 stars 84 forks

add 'tbl_df' and 'tbl' classes to guess_field result #247

Closed cjyetman closed 3 years ago

cjyetman commented 4 years ago

Without adding any dependencies, and with no negative consequences that I can think of, we could add the 'tbl_df' and 'tbl' classes to the result of guess_field. If a user has tibble loaded, the result would print with nice tibble-style formatting; otherwise it would behave exactly like a base data.frame. Something like...

class(result) <- c('tbl_df', 'tbl', 'data.frame')
return(result)

at the end of the guess_field function.
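For illustration, here is a minimal sketch of what that class assignment does to a plain data.frame. The `result` object and its column names are hypothetical stand-ins, not the actual guess_field output.

```r
# Hypothetical stand-in for the data.frame built inside guess_field
result <- data.frame(
  code = c("iso3c", "cowc", "cldr.name.en"),
  percent = c(100, 75, 50),
  stringsAsFactors = FALSE
)

# Prepend the tibble classes; no tibble code is needed for this step
class(result) <- c("tbl_df", "tbl", "data.frame")

# The object still passes the usual base checks
is.data.frame(result)
inherits(result, "tbl_df")
```

If the tibble namespace is not loaded, no methods are registered for 'tbl_df' or 'tbl', so printing and subsetting fall through to the data.frame methods.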

vincentarelbundock commented 4 years ago

I don't know much about tibbles, but as_tibble is a pretty complicated function which "repairs" names and does some other stuff:

https://github.com/tidyverse/tibble/blob/master/R/as_tibble.R

I worry that if we create "fake" tibbles that don't have all the proper structure and features, some packages might break when interacting with our output.

cjyetman commented 4 years ago

Yeah, like it would force tibble's versions of [ and [[ on that object... but I figure it's extremely unlikely that anyone would use the output from guess_field for anything other than quickly deciding what to plug into countrycode.

The benefit I was trying to achieve was to not fill the console with hundreds of lines of output if many of the cldr codes matched (or otherwise), which happened to me a few times when I was experimenting with it. Maybe a better solution would be to have a default max_results = 10 type argument?
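A rough sketch of how such a limit might look, written as a standalone helper. The name `truncate_matches`, the `percent` column, and the sample data are assumptions for illustration, not existing countrycode code.

```r
# Hypothetical helper; not part of countrycode
truncate_matches <- function(result, max_results = 10) {
  # Sort the best matches first, then keep at most max_results rows
  result <- result[order(result$percent, decreasing = TRUE), , drop = FALSE]
  utils::head(result, max_results)
}

# Made-up match table with 25 candidate codes
matches <- data.frame(
  code = paste0("code", 1:25),
  percent = seq(100, 4, by = -4),
  stringsAsFactors = FALSE
)

shown <- truncate_matches(matches)
nrow(shown)
```

Passing a larger `max_results` (or `Inf`) would restore the full listing for anyone who wants it.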

vincentarelbundock commented 4 years ago

Yeah, that makes sense.

If we set a maximum number of results, maybe it would make sense to print a warning when we omit rows whose level of similarity is identical to that of rows that are shown. If the vector is short, a lot of the codes will be 100% matches, so the top 10 is somewhat arbitrary in that context.
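One possible shape for that check, as a sketch only. The function name, the `percent` column, and the warning text are all assumptions, not anything that exists in countrycode.

```r
# Hypothetical helper; name, column, and message are illustrative assumptions
truncate_with_tie_warning <- function(result, max_results = 10) {
  result <- result[order(result$percent, decreasing = TRUE), , drop = FALSE]
  if (nrow(result) <= max_results) {
    return(result)
  }
  # Similarity of the last row that will be shown
  cutoff <- result$percent[max_results]
  # Count omitted rows that match exactly as well as that last row
  n_tied <- sum(result$percent[-seq_len(max_results)] == cutoff)
  if (n_tied > 0) {
    warning(n_tied, " omitted row(s) match as well as the last row shown; ",
            "the top ", max_results, " is arbitrary among ties.",
            call. = FALSE)
  }
  utils::head(result, max_results)
}

# Short input vector: 12 codes match 100%, so two ties fall below the cut
short_vector <- data.frame(
  code = paste0("code", 1:15),
  percent = c(rep(100, 12), 50, 25, 0),
  stringsAsFactors = FALSE
)
shown <- truncate_with_tie_warning(short_vector)
```

Here the call emits a warning because two of the omitted rows are also 100% matches. When nothing omitted ties with the last row shown, the function stays silent.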

cjyetman commented 4 years ago

100% agree