tidyverse / rvest

Simple web scraping for R
https://rvest.tidyverse.org

Try and figure out what's up with encoding guessing #368

Closed hadley closed 1 year ago

hadley commented 1 year ago

@gagolews any idea why stringi::stri_enc_detect() would return different values on different operating systems?

I generally regret providing this interface in the first place, so I might just remove the tests or deprecate the whole thing.

gagolews commented 1 year ago

I think it – unfortunately – strongly depends on the version of ICU installed... The heuristics under the hood are highly imperfect, so I don't recommend being overly strict with respect to the reproducibility of the results they generate.
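To compare two machines, it may help to print the ICU version that stringi is linked against next to the detection candidates for the same bytes. The following is a minimal sketch, not something taken from the thread: the sample sentence and the choice of windows-1250 are assumptions made purely for illustration.

```r
library(stringi)

# ICU version this stringi build is linked against; differences here
# are a likely source of differing guesses across systems.
icu_version <- stri_info()$ICU.version

# Hypothetical sample: a Polish pangram (written with \u escapes so the
# script itself stays ASCII), converted to raw windows-1250 bytes.
sample_text <- "Za\u017c\u00f3\u0142\u0107 g\u0119\u015bl\u0105 ja\u017a\u0144"
sample_bytes <- stri_encode(
  sample_text,
  from = "UTF-8",
  to = "windows-1250",
  to_raw = TRUE
)[[1]]

# stri_enc_detect() returns one data frame per input, with the columns
# Encoding, Language and Confidence.
guesses <- stri_enc_detect(sample_bytes)[[1]]

cat("ICU version:", icu_version, "\n")
print(guesses)
```

Running this on each operating system and comparing the printed tables shows whether the candidate ranking or the confidence values differ along with the ICU version.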

hadley commented 1 year ago

Yeah, that's what I figured. I think the best bet is just to always skip these tests, which I've done on main.
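An unconditional skip in testthat could look roughly like the sketch below. The actual change made on main is not shown in this thread, so the test name, the sample bytes, and the expectation are all hypothetical.

```r
library(testthat)

test_that("encoding guess matches expected value", {
  # Always skipped: the result depends on the installed ICU version.
  skip("stri_enc_detect() output varies with the ICU version")

  # Hypothetical input: "Zażółć" as windows-1250 bytes.
  bytes <- as.raw(c(0x5a, 0x61, 0xbf, 0xf3, 0xb3, 0xe6))
  guesses <- stringi::stri_enc_detect(bytes)[[1]]

  # The kind of strict assertion that may pass on one system and fail
  # on another.
  expect_equal(guesses$Encoding[1], "windows-1250")
})
```

With `skip()` as the first statement, the remaining code never runs, so the test is reported as skipped on every platform rather than failing on some of them.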