gagolews / stringi

Fast and portable character string processing in R (with the Unicode ICU)
https://stringi.gagolewski.com/

stri_locale_list() showing character(0) #412

Closed tklejmont closed 3 years ago

tklejmont commented 3 years ago

Hi there,

I noticed that stri_locale_list() is not displaying any codes when using stringi 1.4.6 that came preinstalled with R 4.0.2. If I install and load stringi 1.5.3 in that same environment, it works fine. Could this be a compilation issue during the R install or perhaps something else is causing this behavior?

Here's the output of using the preinstalled 1.4.6 vs loading 1.5.3 and running the same:

library(stringi, lib.loc = "/opt/R/4.0.2/lib/R/library")
stringi::stri_locale_list()
character(0)
detach("package:stringi", unload = TRUE)
library(stringi)
stringi::stri_locale_list()
[1] "af" "af_NA" "af_ZA" "agq" "agq_CM" "ak" "ak_GH" "am" "am_ET" "ar" "ar_001"

Please let me know if I can provide any additional information.

Thank you

gagolews commented 3 years ago

Just guessing: it might be due to your system ICU being shipped with a stub data file.

What's the output of stringi::stri_info() ? What operating system are you on?

tklejmont commented 3 years ago

Running on RHEL 7.

Here's the output of stringi::stri_info()

$Unicode.version
[1] "8.0"

$ICU.version
[1] "57.1"

$Locale
$Locale$Language
[1] "en"

$Locale$Country
[1] "US"

$Locale$Variant
[1] ""

$Locale$Name
[1] "en_US"

$Charset.internal
[1] "UTF-8" "UTF-16"

$Charset.native
$Charset.native$Name.friendly
[1] "UTF-8"

$Charset.native$Name.ICU
[1] "UTF-8"

$Charset.native$Name.UTR22
[1] NA

$Charset.native$Name.IBM
[1] "ibm-1208"

$Charset.native$Name.WINDOWS
[1] "windows-65001"

$Charset.native$Name.JAVA
[1] "UTF-8"

$Charset.native$Name.IANA
[1] "UTF-8"

$Charset.native$Name.MIME
[1] "UTF-8"

$Charset.native$ASCII.subset
[1] TRUE

$Charset.native$Unicode.1to1
[1] NA

$Charset.native$CharSize.8bit
[1] FALSE

$Charset.native$CharSize.min
[1] 1

$Charset.native$CharSize.max
[1] 3

$ICU.system
[1] TRUE

$ICU.UTF8
[1] FALSE

Warning message:
In stringi::stri_info() :
  Your current locale is not in the list of available locales. Some functions may not work properly. Refer to stri_locale_list() for more details on known locale specifiers.

gagolews commented 3 years ago

Yes, it is your system ICU that is not shipped with icudata. Try installing the libicu-devel rpm maybe?
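A minimal sketch of how this diagnosis could be checked from R, using only the `ICU.system` field of `stri_info()` shown above and the result of `stri_locale_list()`. The helper name `diagnose_icu` is hypothetical and not part of stringi; it takes both values as arguments so it can be tried without a broken installation at hand.

```r
# Hypothetical helper (not part of stringi): classify an installation from
# the list returned by stri_info() and the vector from stri_locale_list().
diagnose_icu <- function(info, locales) {
  uses_system_icu <- isTRUE(info$ICU.system)
  has_locale_data <- length(locales) > 0

  if (has_locale_data) {
    "ok: ICU locale data is available"
  } else if (uses_system_icu) {
    "system ICU in use, but no locale data found (possibly a stub data file)"
  } else {
    "bundled ICU in use, but no locale data found"
  }
}

# Values as reported in this thread for stringi 1.4.6
diagnose_icu(list(ICU.system = TRUE), character(0))

# In a live session one would pass the real values instead:
# diagnose_icu(stringi::stri_info(), stringi::stri_locale_list())
```

The messages are only labels for the three cases; an empty locale list together with `ICU.system` being `TRUE` is the combination reported in this issue.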

tklejmont commented 3 years ago

Our system has the libicu-devel library installed and, as I mentioned before, stringi 1.5.3 works fine when installed in that same environment. So I suppose this must have been a compilation issue with the package during the R installation, but wouldn't it have errored out at that time? I'm not familiar with how packages are handled during a new R build (whether they come as binaries or are compiled during installation), but our system does appear to have all the libraries required to compile stringi correctly.

yum list installed | grep libicu-devel
libicu-devel.x86_64 57.1-1 @icu-mirror
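To compare the preinstalled copy with the newer one, it may help to list every installed copy of the package along with the R version it was built under. The sketch below relies only on base R's `installed.packages()`; the helper name `find_copies` is hypothetical, and the example uses the `base` package so that it runs on any installation.

```r
# Hypothetical helper: list every installed copy of a package across a set of
# library directories, with its version and the R version it was built under.
find_copies <- function(pkg, libs = .libPaths()) {
  ip <- installed.packages(lib.loc = libs)
  hits <- ip[ip[, "Package"] == pkg,
             c("Package", "LibPath", "Version", "Built"),
             drop = FALSE]
  rownames(hits) <- NULL
  as.data.frame(hits, stringsAsFactors = FALSE)
}

# Runs anywhere, since "base" is always installed
find_copies("base")

# For the situation in this thread one would include the path used above:
# find_copies("stringi", c("/opt/R/4.0.2/lib/R/library", .libPaths()))
```

This only shows which copies exist and where they live. It does not reveal whether a given copy was linked against the system ICU or a bundled one; that is what `stri_info()$ICU.system` reports once the copy is loaded.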

gagolews commented 3 years ago

(closing as inactive for > 1 year)