Closed yutannihilation closed 3 years ago
That's interesting.
The command parzer::parse_lon("10")
and the gist you prepared (minimal_rcpp_code_to_freeeze_windows.R
) freeze on my Windows laptop. I often use parzer::parse_lon on large character vectors, i find it slow but never that slow, I don't get what's happening here.
@sckott preferred to avoid dependence on boost but maybe that changed. I would be curious to try your fix.
I often use parzer::parse_lon on large character vectors, i find it slow but never that slow
Oh, curious. So, maybe it means, when the memory pressure is so high, the allocation for regex cache fails early? I wonder to what extent it speeds up if you set Sys.setlocale("LC_COLLATE", "C")
.
Thanks for the issue. I don't have a Windows machine, but will see if we can replicate on CI at least.
By the way, I played with replacing Rcpp with cpp11 https://github.com/ropensci/parzer/issues/27 - not done yet though. I wonder if that would solve this problem
will see if we can replicate on CI at least.
I'm afraid we can't. I'm not sure if GitHub Actions' Windows runner has Language Pack of MBCS locale installed.
I wonder if that would solve this problem
No, it won't. As I described above, this is a GCC's bug. cpp11 (or Rcpp) doesn't save us from using std::regex
.
I'm afraid we can't.
Sorry, I was wrong. It seems it's reproducible on GitHub Actions CI:
Glad its reproducible on CI.
I'd rather not use boost as it's such a heavy dependency for folks that install from source. Maybe we try withr first and see how that goes. If it's not satisfactory we can try boost
Thanks, agreed. Then may I try a pull request? I'm not confident with my C++ skill, but if it's R code maybe I can do it, and since I have Windows machine with such a locale, it's easier to test.
Yes, PR please.
Sure, maybe I'll do it tomorrow or this weekend :)
@sckott I made a pull request here: #32. Could you take a look when you have time? Since I implemented it in a bit tricky way, I'm curious about what you think...
thanks, yes i will
parzer's functions using
std::regex
hang (or, to be precise, take a very long time) on Windows. Here's an example:It turned out this is due to some bug in GCC, which is unlikely to see the fix in Rtools very soon. Possible workarounds are:
boost::regex
std::regex
withwithr::with_locale(new = c(LC_COLLATE = "C"), ...)
Do you think either is acceptable? I'm happy to create a pull request and test with my Windows laptop.
For the details, please refer to the Twitter thread: https://twitter.com/opencpu/status/1353000471062048771