ropensci / spelling

Tools for Spell Checking in R
https://docs.ropensci.org/spelling

"PCDATA invalid Char value" error #62

Closed by IndrajeetPatil 2 years ago

IndrajeetPatil commented 2 years ago

If I try to spell check this file, I get the following error:

> spelling:::spell_check_file_md("README.md")
Error in read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html,  : 
  PCDATA invalid Char value 27 [9]

I tried to debug this further but couldn't find the offending text. In case it helps, this is the traceback I see:

> traceback()
7: read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html, 
       options = options)
6: read_xml.character(md)
5: xml2::read_xml(md)
4: xml_find_all(x, "//namespace::*[name()='']/parent::*")
3: xml2::xml_ns_strip(xml2::read_xml(md))
2: parse_text_md(path)
1: spelling:::spell_check_file_md("README.md")

Also, here is my session information:

Session info

``` r
sessioninfo::session_info()
#> - Session info --------------------------------------------------------------
#>  hash: women holding hands: dark skin tone, open hands: light skin tone, thermometer
#>
#>  setting  value
#>  version  R version 4.1.1 (2021-08-10)
#>  os       Windows 10 x64 (build 19043)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_United Kingdom.1252
#>  ctype    English_United Kingdom.1252
#>  tz       Europe/Berlin
#>  date     2021-10-09
#>  pandoc   2.14.2 @ C:/PROGRA~1/Pandoc/ (via rmarkdown)
#>
#> - Packages -------------------------------------------------------------------
#>  package     * version    date (UTC) lib source
#>  backports     1.2.1      2020-12-09 [1] CRAN (R 4.1.0)
#>  cli           3.0.1      2021-07-17 [1] CRAN (R 4.1.0)
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.1.1)
#>  digest        0.6.28     2021-09-23 [1] CRAN (R 4.1.1)
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.1.1)
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.1)
#>  fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.1.1)
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.1.1)
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.1.1)
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.1.1)
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.1)
#>  knitr         1.36.3     2021-10-09 [1] Github (yihui/knitr@00469e0)
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.1)
#>  pillar        1.6.3      2021-09-26 [1] CRAN (R 4.1.1)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.1)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.1)
#>  R.cache       0.15.0     2021-04-30 [1] CRAN (R 4.1.1)
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.1.0)
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.1.0)
#>  R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.1.1)
#>  reprex        2.0.1      2021-08-05 [1] CRAN (R 4.1.1)
#>  rlang         0.4.11     2021-04-30 [1] CRAN (R 4.1.1)
#>  rmarkdown     2.11.3     2021-10-09 [1] Github (rstudio/rmarkdown@5a3e941)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.1)
#>  sessioninfo   1.1.1.9000 2021-10-09 [1] Github (r-lib/sessioninfo@1ff2194)
#>  spelling    * 2.2        2020-10-18 [1] CRAN (R 4.1.1)
#>  stringi       1.7.5      2021-10-04 [1] CRAN (R 4.1.1)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.1)
#>  styler        1.6.2.9000 2021-10-08 [1] Github (r-lib/styler@7c46e20)
#>  tibble        3.1.5      2021-09-30 [1] CRAN (R 4.1.1)
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.1)
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.1)
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.1)
#>  xfun          0.26       2021-09-14 [1] CRAN (R 4.1.1)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.1.0)
#>
#>  [1] C:/Users/IndrajeetPatil/Documents/R/win-library/4.1
#>  [2] C:/Program Files/R/R-4.1.1/library
```

IndrajeetPatil commented 2 years ago

Can no longer reproduce this, so closing this issue.

tjmahr commented 2 years ago

FWIW, I got this error on a file that had a bunch of ANSI highlighting written out to it:

[90m---
[39m[32mtitle[39m[90m:[39m Notebook Title
[32mauthor[39m[90m:[39m Author Name
[32mdate[39m[90m:[39m |
  `r knitr::inline_expr('format(Sys.time(), "Updated on %A, %B %d, %Y %I[90m:[39m%M %p")')`
[32msite[39m[90m:[39m bookdown::bookdown_site
[32mlink-citations[39m[90m:[39m true
[90m---
[39m

Maybe this will help someone else in the future.
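
For anyone hitting this with ANSI-coloured text: the `27` in the error message is the decimal code of the ESC character (0x1B) that starts each colour sequence such as `ESC[90m`. XML 1.0 does not allow that character, which is presumably why `xml2::read_xml()` rejects the intermediate document. A minimal sketch of a workaround, assuming the escapes are plain SGR colour codes, is to strip them before spell checking. `strip_ansi_sgr()` is a hypothetical helper written for this illustration, not a function from spelling:

```r
# Hypothetical helper (not part of the spelling package): remove ANSI SGR
# colour sequences such as ESC[90m or ESC[39m from a character vector.
strip_ansi_sgr <- function(x) {
  # "\033" is the ESC character (decimal 27); "[0-9;]*m" matches the
  # numeric parameters and the terminating "m" of an SGR sequence.
  gsub("\033\\[[0-9;]*m", "", x, perl = TRUE)
}

# Example input modelled on the YAML header shown above.
lines <- c(
  "\033[32mtitle\033[39m\033[90m:\033[39m Notebook Title",
  "plain text"
)
clean <- strip_ansi_sgr(lines)
print(clean)
```

The cleaned lines can then be written to a temporary file with `writeLines()` and that file passed to the spell checker. If adding a dependency is acceptable, `cli::ansi_strip()` should cover more escape types than this regular expression does.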

megha-gen commented 1 year ago

While retrieving the results, I am getting the following error:

Error in read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html, :
  PCDATA invalid Char value 27 [9]

How can I debug this?
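
One way to approach the debugging question, sketched under the assumption that the failure comes from control characters in the input file (value 27 is ESC, 0x1B): scan the file for characters that XML 1.0 forbids and report the lines they occur on. `find_invalid_xml_chars()` is a hypothetical helper name, and the character class below only covers the control characters below 0x20:

```r
# Hypothetical helper (not part of spelling or xml2): list the lines of a
# text file that contain control characters XML 1.0 does not permit.
find_invalid_xml_chars <- function(path) {
  lines <- readLines(path, warn = FALSE, encoding = "UTF-8")
  # XML 1.0 allows tab (0x09), LF (0x0A) and CR (0x0D), but no other
  # character below 0x20.
  pattern <- "[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F]"
  hits <- grep(pattern, lines, perl = TRUE)
  data.frame(line = hits, text = lines[hits], stringsAsFactors = FALSE)
}

# Demonstration on a temporary file; the second line contains ESC characters.
tmp <- tempfile(fileext = ".md")
writeLines(
  c("# Title", "\033[32mauthor\033[39m: Author Name", "Plain text"),
  tmp
)
bad <- find_invalid_xml_chars(tmp)
print(bad$line)
```

Pointing the helper at the file that fails (for example `README.md`) should narrow the search to a handful of lines, which can then be cleaned by hand or regenerated without colour output.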