Closed: chleeb closed this issue 2 years ago
Seems to be an issue with the `Encoding()` assignment function specifically. I can't check whether it's Windows-specific at the moment.
Using the `enc2utf8()` function, or specifying the encoding directly in `read.delim()`, the `–` displays properly.
So I'll fix the function to do that instead.
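As an illustration of specifying the encoding directly in `read.delim()`, here is a minimal sketch. It assumes the data arrives as Windows-1252 bytes and uses the `fileEncoding` argument; the temporary file, the column values and the choice of `"CP1252"` are assumptions for the example, not necessarily what the package does internally.

```r
# Build a small tab-separated file whose en dash is stored as the single
# Windows-1252 byte 0x96 (hypothetical sample data, not real BOLD output).
tmp <- tempfile(fileext = ".tsv")
writeBin(
  c(charToRaw("id\tcopyright_licenses\n1\tCreativeCommons "),
    as.raw(0x96),
    charToRaw(" Attribution Share-Alike (by-sa)\n")),
  tmp
)

# Declare the source encoding while reading, so R converts the bytes
# instead of relabelling them afterwards.
records <- read.delim(tmp, fileEncoding = "CP1252", stringsAsFactors = FALSE)

# Normalise to UTF-8 for storage and comparison.
licence <- enc2utf8(records$copyright_licenses)
print(licence)

unlink(tmp)
```

Declaring the encoding at read time avoids a separate fix-up step on the resulting character columns.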
In the meantime, to keep it in UTF-8 and have it display properly, you can change the encoding to "latin1" (or even just "") and convert back to UTF-8 with ``enc2utf8(`Encoding<-`(requested_bold_records$copyright_licenses, ""))``
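To show why relabelling and converting behave differently, here is a small sketch with hypothetical sample bytes. It assumes the source data is Windows-1252, where byte `0x96` is the en dash. Note that the workaround above relies on `enc2utf8()` treating unmarked strings as being in the native encoding, so it is only expected to help on a Windows-1252 locale; the sketch uses `iconv()` with an explicit source encoding so that it does not depend on the locale.

```r
# 0x96 is the en dash in Windows-1252; it is not valid UTF-8 on its own.
bytes <- c(charToRaw("CreativeCommons "), as.raw(0x96), charToRaw(" Attribution"))
x <- rawToChar(bytes)

# Relabelling: only the encoding mark changes, the bytes stay the same.
mislabelled <- x
Encoding(mislabelled) <- "UTF-8"
identical(charToRaw(mislabelled), bytes)  # bytes are untouched
validUTF8(mislabelled)                    # the string is not valid UTF-8

# Converting: the bytes are rewritten from Windows-1252 into UTF-8.
converted <- iconv(x, from = "CP1252", to = "UTF-8")
validUTF8(converted)
charToRaw(converted)[17:19]  # the three UTF-8 bytes of U+2013 (en dash)
```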
I'm seeing a related error:
> test <- bold_seqspec('Thecostraca')
Error in type.convert.default(data[[i]], as.is = as.is[i], dec = dec, :
invalid multibyte string at '<a0>Oct<6f>meris angulosa'
Session info:
R version 4.1.3 (2022-03-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.3.1
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_1.0.8 taxize_0.9.99 bold_1.2.0
loaded via a namespace (and not attached):
[1] zoo_1.8-9 tidyselect_1.1.2 xfun_0.30 purrr_0.3.4 lattice_0.20-45 vctrs_0.3.8 generics_0.1.2 htmltools_0.5.2 yaml_2.3.5
[10] utf8_1.2.2 rlang_1.0.2 pillar_1.7.0 httpcode_0.3.0 glue_1.6.2 DBI_1.1.2 uuid_1.0-3 foreach_1.5.2 lifecycle_1.0.1
[19] plyr_1.8.6 stringr_1.4.0 codetools_0.2-18 evaluate_0.15 knitr_1.37 fastmap_1.1.0 parallel_4.1.3 curl_4.3.2 fansi_1.0.2
[28] triebeard_0.3.0 urltools_1.7.3 Rcpp_1.0.8 jsonlite_1.8.0 digest_0.6.29 stringi_1.7.6 grid_4.1.3 cli_3.2.0 tools_4.1.3
[37] magrittr_2.0.2 tibble_3.1.6 crul_1.2.0 crayon_1.5.0 ape_5.6-2 pkgconfig_2.0.3 ellipsis_0.3.2 data.table_1.14.2 xml2_1.3.3
[46] assertthat_0.2.1 rmarkdown_2.13 reshape_0.8.9 rstudioapi_0.13 iterators_1.0.14 R6_2.5.1 conditionz_0.1.0 nlme_3.1-155 compiler_4.1.3
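For `invalid multibyte string` errors like the one above, a quick way to locate the offending values is to test each string for UTF-8 validity. This is only a sketch: the sample vector is hypothetical, and the assumption that the stray byte comes from a Latin-1 source is a guess, not something confirmed in this thread.

```r
# Hypothetical values: the second one starts with the lone byte 0xA0, which is
# a non-breaking space in Latin-1 but not a valid UTF-8 sequence.
species <- c("plain ASCII value",
             rawToChar(c(as.raw(0xa0), charToRaw("Octomeris angulosa"))))

# Flag the entries that cannot be interpreted as UTF-8.
bad <- !validUTF8(species)
which(bad)

# Make the offending bytes visible as <xx> escapes.
iconv(species[bad], from = "UTF-8", to = "UTF-8", sub = "byte")

# If the source is known to be Latin-1, convert only the bad entries.
species[bad] <- iconv(species[bad], from = "latin1", to = "UTF-8")
all(validUTF8(species))
```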
I changed the encoding function; it should work now. Let me know if that doesn't fix it.
I recently updated R, RStudio and all my R packages. Since then I have observed a bug in `bold` which might be linked to the UTF-8 encoding. When I run my code, `requested_bold_records$copyright_licenses` is `CreativeCommons \u0096 Attribution Share-Alike (by-sa) 818`. Before updating everything it was `CreativeCommons – Attribution Share-Alike (by-sa) 818`. As `\u0096` is `–`, it might be a problem with the UTF-8 encoding.
Any ideas what's going on and how to fix it? Thanks!
Session Info
```r
R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister
 Normal:  Inversion
 Sample:  Rounding

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] bold_1.2.0

loaded via a namespace (and not attached):
 [1] compiler_4.1.3  magrittr_2.0.2  plyr_1.8.6      R6_2.5.1        tools_4.1.3     httpcode_0.3.0  curl_4.3.2
 [8] urltools_1.7.3  Rcpp_1.0.8.3    triebeard_0.3.0 xml2_1.3.3      stringi_1.7.6   reshape_0.8.8   crul_1.2.0
[15] stringr_1.4.0   jsonlite_1.8.0
```