Open-Systems-Pharmacology / OSPSuite-R

R package for the OSPSuite
https://www.open-systems-pharmacology.org/OSPSuite-R/

Suspicious warnings #1480

Closed Yuri05 closed 3 months ago

Yuri05 commented 3 months ago

dimensionForUnit <- ospsuite::getDimensionForUnit("year(s)")

When I execute the line of code above (ospsuite package version 12.1.0 from the pre-release), I get the following warnings:

Warning messages:
1: In ospsuite::getDimensionForUnit("year(s)") :
  strings not representable in native encoding will be translated to UTF-8
2: In ospsuite::getDimensionForUnit("year(s)") :
  input string 'µ|μ|µ' cannot be translated to UTF-8, is it valid in 'UTF-8' ?
3: In ospsuite::getDimensionForUnit("year(s)") :
  input string 'µ|μ|µ' cannot be translated to UTF-8, is it valid in 'UTF-8' ?

The returned dimension seems to be correct after that:

> dimensionForUnit
[1] "Age in years"

This happens in R 4.2.2 under Windows (see the session info below). It does not happen, for example, in R 4.4.0 under Ubuntu. Also, the warnings appear only once per session; to get them again I have to restart R.

Ultimately, it is the ospsuite function .encodeUnit() that produces those warnings.

sessionInfo

```
> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] rstudioapi_0.15.0 xml2_1.3.3 magrittr_2.0.3 sysfonts_0.8.8 munsell_0.5.0
[6] tidyselect_1.2.0 colorspace_2.1-0 tlf_1.5.170 R6_2.5.1 rlang_1.1.3
[11] fansi_1.0.4 stringr_1.5.0 showtextdb_3.0 dplyr_1.1.4 tools_4.2.2
[16] grid_4.2.2 data.table_1.14.8 gtable_0.3.1 ospsuite_12.1.0 utf8_1.2.3
[21] cli_3.6.2 tibble_3.2.1 lifecycle_1.0.4 rSharp_1.0.0 purrr_1.0.2
[26] ggplot2_3.5.1 tidyr_1.3.0 vctrs_0.6.5 glue_1.7.0 stringi_1.7.12
[31] compiler_4.2.2 pillar_1.9.0 generics_0.1.3 ospsuite.utils_1.5.35 scales_1.3.0
[36] showtext_0.9-6 jsonlite_1.8.8 pkgconfig_2.0.3
```

PavelBal commented 3 months ago

I cannot reproduce this. Maybe it is because UTF-8 support is not activated on your machine.

> sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default

locale:
[1] LC_COLLATE=English_Germany.utf8  LC_CTYPE=English_Germany.utf8    LC_MONETARY=English_Germany.utf8 LC_NUMERIC=C                     LC_TIME=English_Germany.utf8    

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ospsuite_12.1.0.9001 rSharp_1.0.0.9000   

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5          cli_3.6.3            rlang_1.1.4          stringi_1.8.4        showtextdb_3.0       sysfonts_0.8.9       purrr_1.0.2          generics_0.1.3      
 [9] tlf_1.5.0            jsonlite_1.8.8       data.table_1.15.4    glue_1.7.0           colorspace_2.1-1     scales_1.3.0         fansi_1.0.6          grid_4.4.1          
[17] munsell_0.5.1        tibble_3.2.1         lifecycle_1.0.4      stringr_1.5.1        compiler_4.4.1       dplyr_1.1.4          pkgconfig_2.0.3      tidyr_1.3.1         
[25] rstudioapi_0.16.0    ospsuite.utils_1.5.0 R6_2.5.1             tidyselect_1.2.1     utf8_1.2.4           showtext_0.9-7       pillar_1.9.0         magrittr_2.0.3

Not sure what to do.

Yuri05 commented 3 months ago

Maybe it's because you did not activate UTF-8 support on your machine.

Maybe... How can I activate it?

> sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    
Felixmil commented 3 months ago

How can I activate it?

usethis::edit_r_profile()
# Paste following line in file
# invisible(Sys.setlocale("LC_ALL","en_US.UTF-8"))
# Save file and restart session
Sys.getlocale()
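
To confirm after the restart that the session really uses UTF-8 as its native encoding, a minimal sketch with base R only (`l10n_info()` is part of base R; the variable names are just for illustration):

```r
# Query the localization info of the current R session.
info <- l10n_info()

# TRUE if the native encoding of this session is UTF-8.
isUtf8 <- isTRUE(info[["UTF-8"]])
message("Native encoding is UTF-8: ", isUtf8)

# On Windows, l10n_info() also reports the active code page
# (65001 corresponds to UTF-8, 1252 to Western European).
if (!is.null(info$codepage)) {
  message("Active code page: ", info$codepage)
}
```

If `isUtf8` is `FALSE`, the locale line in the R profile was probably not picked up, or the requested locale is not available on the system.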
Felixmil commented 3 months ago

I can reproduce the issue on my side with:

Sys.setlocale(locale="German_Germany.1252")
#> Warning in Sys.setlocale(locale = "German_Germany.1252"): using locale code
#> page other than 65001 ("UTF-8") may cause problems
#> [1] "LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252"

ospsuite::getDimensionForUnit("year(s)")
#> Warning in ospsuite::getDimensionForUnit("year(s)"): strings not representable
#> in native encoding will be translated to UTF-8
#> Warning in ospsuite::getDimensionForUnit("year(s)"): input string 'µ|µ|<b5>'
#> cannot be translated to UTF-8, is it valid in 'UTF-8'?
#> Warning in ospsuite::getDimensionForUnit("year(s)"): input string 'µ|µ|<b5>'
#> cannot be translated to UTF-8, is it valid in 'UTF-8'?
#> [1] "Age in years"

Created on 2024-08-19 with reprex v2.1.1

If I keep German but switch to UTF-8, the warnings disappear:


Sys.setlocale(locale="German_Germany.utf-8")
#> [1] "LC_COLLATE=German_Germany.utf8;LC_CTYPE=German_Germany.utf8;LC_MONETARY=German_Germany.utf8;LC_NUMERIC=C;LC_TIME=German_Germany.utf8"

ospsuite::getDimensionForUnit("year(s)")
#> [1] "Age in years"

Created on 2024-08-19 with reprex v2.1.1

Finally with EN_US:

Sys.setlocale(locale="EN_US.utf-8")
#> [1] "LC_COLLATE=EN_US.utf-8;LC_CTYPE=EN_US.utf-8;LC_MONETARY=EN_US.utf-8;LC_NUMERIC=C;LC_TIME=EN_US.utf-8"

ospsuite::getDimensionForUnit("year(s)")
#> [1] "Age in years"

Created on 2024-08-19 with reprex v2.1.1
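
The three runs above change the locale for the whole session. For repeating such comparisons without leaving the session in a different locale, a small helper along these lines might be useful. This is a sketch only: `withTemporaryLocale()` is a hypothetical name, not a function of ospsuite or base R.

```r
# Hypothetical helper: evaluate `expr` under a temporary locale and restore
# the previous settings afterwards, even if `expr` fails.
withTemporaryLocale <- function(locale, expr) {
  # These are the categories that Sys.setlocale("LC_ALL", ...) changes in R.
  categories <- c("LC_COLLATE", "LC_CTYPE", "LC_MONETARY", "LC_TIME")
  oldLocales <- vapply(categories, Sys.getlocale, character(1))

  on.exit(
    for (category in categories) {
      suppressWarnings(Sys.setlocale(category, oldLocales[[category]]))
    },
    add = TRUE
  )

  # Sys.setlocale() returns "" (with a warning) if the locale cannot be set.
  newLocale <- suppressWarnings(Sys.setlocale("LC_ALL", locale))
  if (identical(newLocale, "")) {
    stop("Locale '", locale, "' is not available on this system.")
  }

  # `expr` is evaluated lazily, so it runs under the new locale.
  force(expr)
}
```

A call such as `withTemporaryLocale("German_Germany.1252", ospsuite::getDimensionForUnit("year(s)"))` would then run the lookup under the 1252 locale only. Note that the helper suppresses the code-page warning issued by `Sys.setlocale()` itself, so only warnings raised by the evaluated expression remain visible.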

Yuri05 commented 3 months ago

@Felixmil Nice, seems to fix the problem! Closing.

Yuri05 commented 3 months ago

After changing my locale I get reproducible crashes in RE (the reporting engine) on Windows in the sensitivity plots. I first tried "German_Germany.utf-8": it crashed twice. Then I deactivated it (switched back to my default locale "German_Germany.1252"): no crash. Then I changed to "en_US.UTF-8": crash again.

Crash screenshot ![grafik](https://github.com/user-attachments/assets/79d9792f-eb80-4c04-9a56-cd027969beaf)

When changing to one of the UTF-8 locales, I also get the following warning when R starts:

Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
  using locale code page other than 1252 may cause problems

@PavelBal @Felixmil Any thoughts on this?

Felixmil commented 3 months ago

Maybe because there is a mismatch between this locale and the one defined in your Windows configuration? Personally, I have this: image

Yuri05 commented 3 months ago

Well, I have German language settings

Settings ![grafik](https://github.com/user-attachments/assets/a4772ed4-8f9c-41db-98e5-279bfec7695b)

In any case, I would expect at most some kind of error or warning message, not a crash of the whole R session.

Felixmil commented 3 months ago

Could you share a reproducible example? I will try to change my locale and run it.

Yuri05 commented 3 months ago

@Felixmil Here is a minimal example: MiniModel.zip. It reproducibly crashes after a few seconds of running if I set my locale to "en_US.UTF-8". There is no problem with the default locale.

You will need the reporting engine for running it: https://ci.appveyor.com/project/open-systems-pharmacology-ci/OSPSuite-ReportingEngine/branch/develop/artifacts

Felixmil commented 3 months ago

I tried to run Workflow.R in the following locales:

All of them worked without any crash.

Did you try to tick this "Beta" box in the advanced language settings? Or change the language there?

image

Yuri05 commented 3 months ago

@Felixmil Indeed, activating the "Use Unicode UTF-8 for worldwide..." option seems to solve the problem: I no longer get crashes with either the "en_US.UTF-8" or the "German_Germany.1252" locale.

Hopefully it will not break other programs :)

Felixmil commented 3 months ago

We also have to keep in mind that we cannot expect users to activate this setting or change their locale to make our packages work on their system. So, even if we found a fix here, I would still keep track of this and investigate.
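
One possible direction for that investigation, as a sketch only: write the micro characters as Unicode escapes so that the source no longer depends on the file or session encoding, and replace them with `fixed = TRUE` on UTF-8 strings. The function `normalizeMu()` and both constants are hypothetical; this is not the actual `.encodeUnit()` implementation, and it has not been checked against the warnings reported above.

```r
# Hypothetical sketch, NOT the actual ospsuite::.encodeUnit() code.
# Unicode escapes keep the source file pure ASCII, so the characters are
# parsed identically regardless of the locale R runs in.
microSign <- "\u00b5" # MICRO SIGN
greekMu <- "\u03bc"   # GREEK SMALL LETTER MU

# Replace the Greek mu by the micro sign so that both spellings of a unit
# (for example "µmol") end up as the same string.
normalizeMu <- function(unit) {
  unit <- enc2utf8(unit)
  gsub(greekMu, microSign, unit, fixed = TRUE)
}
```

Units without a mu, such as "year(s)", pass through unchanged, and the result is a UTF-8 string independent of the session's native encoding.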

Yuri05 commented 3 months ago

So, even if we found a fix here, I would still keep track of this and investigate.

Fully agree