Tazinho / snakecase

🐍🐍🐍 A systematic approach to parse strings and automate the conversion to snake_case, UpperCamelCase or any other case.
https://tazinho.github.io/snakecase/
GNU General Public License v3.0

snakecase::to_any_case() causes warnings on 1st run due to UTF-8 characters #191

Open cjyetman opened 3 years ago

cjyetman commented 3 years ago

Honestly, I can only reliably replicate this in one specific environment (2dii/r-packages), but it seems to be related to how UTF-8 characters are defined in replace_special_characters_internal.R.

snakecase::to_any_case("\u00E4ngstlicher Has\u00EA", transliterations = c("german", "Latin-ASCII"))
warnings()
sessionInfo()
R version 4.0.3 (2020-10-10) -- "Bunny-Wunnies Freak Out"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> snakecase::to_any_case("\u00E4ngstlicher Has\u00EA", transliterations = c("german", "Latin-ASCII"))
[1] "aengstlicher_hase"
There were 13 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In FUN(X[[i]], ...) : unable to translate '' to native encoding
2: In FUN(X[[i]], ...) : unable to translate '' to native encoding
3: In FUN(X[[i]], ...) : unable to translate '' to native encoding
4: In FUN(X[[i]], ...) : unable to translate '' to native encoding
5: In FUN(X[[i]], ...) : unable to translate '' to native encoding
6: In FUN(X[[i]], ...) : unable to translate '' to native encoding
7: In FUN(X[[i]], ...) : unable to translate '' to native encoding
8: In FUN(X[[i]], ...) : unable to translate '' to native encoding
9: In FUN(X[[i]], ...) : unable to translate '' to native encoding
10: In FUN(X[[i]], ...) : unable to translate '' to native encoding
11: In FUN(X[[i]], ...) : unable to translate '' to native encoding
12: In FUN(X[[i]], ...) : unable to translate '' to native encoding
13: In FUN(X[[i]], ...) : unable to translate '' to native encoding
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.0.3   magrittr_1.5     snakecase_0.11.0 tools_4.0.3
[5] stringi_1.5.3    stringr_1.4.0
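
One detail in the session info that may be relevant is the locale, which is reported as C. The sketch below is only a way to inspect that setting and to see that "Ä" (U+00C4) has no ASCII representation. It assumes, without confirmation in this thread, that the C locale is what prevents the translation. locale_summary() is a hypothetical helper, not part of snakecase.

```r
# Hypothetical helper: report the character-type locale and whether
# R considers the current session to be running in a UTF-8 locale.
locale_summary <- function() {
  list(
    lc_ctype = Sys.getlocale("LC_CTYPE"),
    is_utf8  = isTRUE(l10n_info()[["UTF-8"]])
  )
}
str(locale_summary())

# "Ä" (U+00C4) cannot be represented in ASCII, so a strict conversion
# returns NA instead of a string.
a_umlaut <- "\u00C4"
iconv(a_umlaut, from = "UTF-8", to = "ASCII")

# With sub = "byte", iconv() substitutes the raw bytes rather than failing.
iconv(a_umlaut, from = "UTF-8", to = "ASCII", sub = "byte")
```

Comparing the output of locale_summary() between an environment that shows the warnings and one that does not could help narrow down whether the locale matters.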

I can also replicate getting these warnings with...

snakecase:::replace_special_characters_internal

or...

get("replace_special_characters_internal", envir = asNamespace("snakecase"), inherits = FALSE)

but only ever the first time it is run in a session.
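
To check the "first time only" behaviour in a fresh session, the warnings from each access can be collected and counted. This is a minimal sketch in base R; collect_warnings() is a hypothetical helper (not part of snakecase or janitor), and the snakecase part only runs if the package is installed.

```r
# Hypothetical helper: evaluate an expression and return the messages of
# all warnings it raised, muffling them so evaluation continues.
collect_warnings <- function(expr) {
  msgs <- character(0)
  withCallingHandlers(
    expr,  # the promise is forced here, inside the handler's scope
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  msgs
}

# Run this in a fresh R session: access the internal function twice and
# compare how many warnings each access produced.
if (requireNamespace("snakecase", quietly = TRUE)) {
  first  <- collect_warnings(snakecase:::replace_special_characters_internal)
  second <- collect_warnings(snakecase:::replace_special_characters_internal)
  cat("first access: ", length(first), "warning(s)\n")
  cat("second access:", length(second), "warning(s)\n")
}
```

If the counts differ between the first and second access, that would be consistent with the warnings being raised when the function is first loaded rather than when it is called.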

The warnings appear when snakecase:::replace_special_characters_internal() is loaded for the first time. I ended up here because janitor::make_clean_names() calls snakecase:::replace_special_characters_internal(), triggering these warnings.

Maybe, for instance, intToUtf8(196) would be safer than "\u00C4"? 🤷🏻 Any idea what's causing this?
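
For reference, the two spellings can be compared directly. Hex C4 is decimal 196 (220 would be U+00DC, "Ü"). The lookup table at the end is a hypothetical illustration of building names at run time from code points; it is not how snakecase defines its transliterations, and whether this would avoid the warnings is untested here.

```r
# Two ways to write "Ä" (U+00C4) using only ASCII characters in the source.
from_escape <- "\u00C4"
from_int    <- intToUtf8(0xC4)  # 0xC4 is 196 in decimal

# Both should describe the same single code point.
utf8ToInt(from_escape)
utf8ToInt(from_int)
identical(from_escape, from_int)

# Hypothetical lookup table (illustration only): the non-ASCII names are
# generated from code points for "Ä", "Ö" and "Ü" instead of being typed
# as string literals.
german <- c("Ae", "Oe", "Ue")
names(german) <- vapply(c(0xC4, 0xD6, 0xDC), intToUtf8, character(1))
```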

cjyetman commented 3 years ago

Something similar is happening on CRAN's Windows build: https://www.r-project.org/nosvn/R.check/r-release-windows-ix86+x86_64/snakecase-00check.html