Closed aphalo closed 3 years ago
The same example code results in the same crash on a Windows workstation: Intel Xeon E31235, 24 GB RAM, NVIDIA Quadro K2000, and a slightly newer release of Windows 10 Pro (build 19042, November 2020). In this case 'ggrepel' was installed from CRAN.
Here is the output from sessionInfo() in my R session:
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggrepel_0.9.0 ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 withr_2.3.0 dplyr_1.0.2 crayon_1.3.4
[5] grid_4.0.3 R6_2.5.0 lifecycle_0.2.0 gtable_0.3.0
[9] magrittr_2.0.1 scales_1.1.1 pillar_1.4.7 rlang_0.4.10
[13] rstudioapi_0.13 generics_0.1.0 vctrs_0.3.6 ellipsis_0.3.1
[17] tools_4.0.3 glue_1.4.2 purrr_0.3.4 munsell_0.5.0
[21] compiler_4.0.3 pkgconfig_2.0.3 colorspace_2.0-0 tidyselect_1.1.0
[25] tibble_3.0.4
I have a few busy days until the weekend. I will try to have a closer look at the code during the coming weekend. This example gives at least a 'reprex' for what I informally mentioned a few days ago in a comment.
Oh no! Sorry for the crashes.
I am surprised, because CRAN reports that Windows is OK: https://www.r-project.org/nosvn/R.check/r-devel-windows-ix86+x86_64/ggrepel-00check.html
CRAN Package Check Results for Package ggrepel Last updated on 2021-01-05 00:47:43 CET.
Flavor | Version | Tinstall | Tcheck | Ttotal | Status | Flags |
---|---|---|---|---|---|---|
r-devel-linux-x86_64-debian-clang | 0.9.0 | 26.52 | 67.43 | 93.95 | OK | |
r-devel-linux-x86_64-debian-gcc | 0.9.0 | 17.58 | 50.18 | 67.76 | OK | |
r-devel-linux-x86_64-fedora-clang | 0.9.0 | | | 121.19 | OK | |
r-devel-linux-x86_64-fedora-gcc | 0.9.0 | | | 108.31 | OK | |
r-devel-windows-ix86+x86_64 | 0.9.0 | 63.00 | 140.00 | 203.00 | OK | |
r-patched-linux-x86_64 | 0.9.0 | 21.82 | 62.55 | 84.37 | OK | |
r-patched-solaris-x86 | 0.9.0 | | | 136.90 | OK | |
r-release-linux-x86_64 | 0.9.0 | 22.25 | 61.72 | 83.97 | OK | |
r-release-macos-x86_64 | 0.9.0 | | | | OK | |
r-release-windows-ix86+x86_64 | 0.9.0 | 64.00 | 137.00 | 201.00 | OK | |
r-oldrel-macos-x86_64 | 0.9.0 | | | | OK | |
r-oldrel-windows-ix86+x86_64 | 0.9.0 | 45.00 | 118.00 | 163.00 | OK | |
This is puzzling, and makes me wonder about your system configuration ...
You're reporting that this example from the examples.Rmd vignette is crashing your Windows systems.
In the CRAN Windows log, we can see that building the vignette seems to work:
checking files in 'vignettes' ... OK
checking examples ...
running examples for arch 'i386' ... [4s] OK
running examples for arch 'x64' ... [4s] OK
checking for unstated dependencies in 'tests' ... OK
checking tests ...
running tests for arch 'i386' ... [13s] OK
Running 'testthat.R' [13s]
running tests for arch 'x64' ... [15s] OK
Running 'testthat.R' [15s]
checking for unstated dependencies in vignettes ... OK
checking package vignettes in 'inst/doc' ... OK
checking re-building of vignette outputs ... [4s] OK
checking PDF version of manual ... OK
DONE
Status: OK
In the past, I fixed a heap buffer overflow that was detected when CRAN reported the output from ASAN: https://github.com/slowkow/ggrepel/commit/cad83bb0cb0e0426caff8bc4bd97003f6c88e5c7
It is possible that running address sanitizer (ASAN) might help to discover which lines are causing the problem on your system. Here is a blog post about that: https://knausb.github.io/2017/06/validating-asan/
Here is the way I recommend you try running ASAN on your machine. Brodie Gaslam shared these steps with me in the past, and it worked for me:
Download Winston Chang's r-debug docker container on my OS X macbook:
https://hub.docker.com/r/wch1/r-debug/
Here are the steps I followed:
Install docker (I did this ages ago, so don't remember the details)
Start docker
cd <ggrepel source dir>
docker pull wch1/r-debug
docker run --rm -ti --security-opt seccomp=unconfined -v $(pwd):/mydir wch1/r-debug
RDsan -e "install.packages(c('ggplot2', 'knitr', 'rmarkdown', 'gridExtra', 'prettydoc'))"
cd /mydir
RDsan CMD build .
Hi, No problem about the crashes! I know some of my examples push 'ggrepel' to its limits. Your package was such a breakthrough when you first released it and it remains so useful!
The unedited example from the vignette does not crash on either of my computers. My example code for reproducing the crash is modified from that in the package documentation to use the whole of mtcars instead of a subset of it. It must be much harder work to repel this many labels than the three in the original example. Maybe there is not even enough space for all labels, but still a crash is not good... The problem appears only when one has a large number of unlabelled points (1000's) and a fairly large number of labelled points (10's). There may be a memory leak or something similar. One hint as to what may be behind it is that I did not change the number of unlabelled points but rather the number of labelled points. This roughly matches what I was seeing with the spectral data. I haven't noticed this with versions of 'ggrepel' before 0.9.0.
The 'ggrepel' package does pass check --as-cran on my own laptop. This agrees with your experience. The crash appears only in some cases, and with large data sets for which there are no examples or tests in the package. I first noticed the problem when trying to build the vignette for my own package 'ggspectra'. It is odd that 'ggspectra' passes its tests on CRAN, but I do not really know if the daily checks on CRAN rebuild the vignettes or not.
If my example code does not fail under other operating systems, the problem may be elsewhere. I will install an earlier version of 'ggrepel' to make sure that this is not an old problem, or a problem triggered by something different in both of my own computers. (I should set up continuous integration on GitHub for my packages... as you have done for 'ggrepel'.) I will also try to see whether the problem comes from Windows, Rcpp or RTools updates.
Best wishes,
Pedro.
I have an RStudio Cloud account but I did not think earlier of using it for testing...
I can confirm that there is no crash under Ubuntu, just a figure being rendered quickly, but with several overlapping labels. This is in an instance with rather little RAM at 1GB, and this RAM is reported as far from being fully in use.
I need to explore how to use ASAN under Windows with the GNU compilers (everything is rather patched together under Windows, and the GNU compiler versions used are much older than under Linux). I will try a few less time-consuming things before embarking on this.
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.7 LTS
Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8
[5] LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8 LC_PAPER=C.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggrepel_0.9.0 ggplot2_3.3.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 withr_2.3.0 crayon_1.3.4 grid_4.0.3 R6_2.5.0 lifecycle_0.2.0
[7] gtable_0.3.0 magrittr_2.0.1 scales_1.1.1 pillar_1.4.7 rlang_0.4.10 rstudioapi_0.13
[13] vctrs_0.3.6 ellipsis_0.3.1 tools_4.0.3 glue_1.4.2 munsell_0.5.0 compiler_4.0.3
[19] pkgconfig_2.0.3 colorspace_2.0-0 tibble_3.0.4
@slowkow I have now installed R 4.1.0.pre (today's R devel build) and I see no crash on my laptop, but if I switch back to R 4.0.3 I get the crash again.
R Under development (unstable) (2021-01-04 r79789)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_Finland.1252 LC_CTYPE=English_Finland.1252
[3] LC_MONETARY=English_Finland.1252 LC_NUMERIC=C
[5] LC_TIME=English_Finland.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggrepel_0.9.0 ggplot2_3.3.2
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 withr_2.2.0 dplyr_1.0.1 crayon_1.3.4 grid_4.1.0
[6] R6_2.4.1 lifecycle_0.2.0 gtable_0.3.0 magrittr_1.5 scales_1.1.1
[11] pillar_1.4.6 rlang_0.4.7 rstudioapi_0.11 generics_0.0.2 vctrs_0.3.2
[16] ellipsis_0.3.1 tools_4.1.0 glue_1.4.1 purrr_0.3.4 munsell_0.5.0
[21] parallel_4.1.0 compiler_4.1.0 pkgconfig_2.0.3 colorspace_1.4-1 tidyselect_1.1.0
[26] tibble_3.0.3
@slowkow I thought I had caught the bug, but no. All I can say is that it is extremely weird and specific to Windows. It seems to have something to do with Windows, R, or package installation. I can "cure" the crashes for a while by downgrading from 'tibble' 3.0.4 to 'tibble' 3.0.3, but they soon reappear; if I then restart Windows and upgrade from 'tibble' 3.0.3 to 'tibble' 3.0.4, things work normally for a while in the same R session. At least once, the trick with 'tibble' did not help, but additionally downgrading from 'vctrs' 0.3.6 to 'vctrs' 0.3.5 stopped the crashes for a while. This is rather perplexing! I haven't tried this trick with packages other than these two; I will try in the evening. I see the same thing under both R 4.0.3 and R devel. 'tibble' and 'vctrs' seemed good candidates as they had different versions in the tests above.
I tried another thing: re-installing 'tibble' 3.0.4 also cures the crashes for the current R session.
[Some time later:] This trick has now stopped working... I am again getting crashes consistently... I cannot really make any sense of what is going on...
Oh boy, this sounds like a frustrating experience. Sorry for all the trouble, Pedro! I've been in this kind of debugging hell before, and it can be a nightmare.
It sounds like you tried a number of different things, which is great, but also complex to think about.
An important note for myself is that your example is similar but not identical to the example in the vignette. I totally overlooked the fact that you're using the entire mtcars, not just 3 of the cars like in my vignette. Sorry for overlooking this! I believe there is some chance that the current CRAN check for Windows might be insufficient; maybe CRAN Windows would crash with your example code.
I think I should include your crashing example in the tests, and then we can see if it runs on all platforms (CI on Github, and eventually CRAN). I hope this will eventually help to diagnose if this is indeed a problem with 'ggrepel' (I suspect that it is).
With your crashing example in-hand, I should be able to reproduce your crash somewhere... I need some time to try this. Once I migrate the CRAN check from 'Travis CI' to 'Github Actions', then we will be able to test ggrepel on Windows, Linux, and macOS after every commit.
TODO:
I'm sorry you have to deal with crashes, and I'll try to prioritize this. But I'm juggling a few things right now 🤹♀️, so it may be a little while.
I am very happy to help as much as my available time allows. Having looked at both the R and C++ code in 'ggrepel', and having not too long ago written a geom_grob() for 'ggpmisc', I will try to write a geom_grob_repel() using 'ggrepel's machinery. It may be a more difficult task than it looks at first sight, but my impression is that it would be quite easy to implement in 'ggrepel'.
My packages did not have CI set up before. I just added GitHub Actions to 'gginnards' and 'ggpmisc' with package 'usethis'.
I ran the following statement, then committed and pushed, and the first check ran on GitHub (visible in the Actions page of the repo).
usethis::use_github_action("check-standard")
This and other examples are at https://github.com/r-lib/actions/tree/master/examples
I started by reading this post https://ropensci.org/technotes/2020/11/19/moving-away-travis/
Hopefully this also works for you.
@aphalo Thank you for the tip about usethis! That makes it easy. I think it is working.
We'll find out if GitHub Actions on Windows is crashing with your test code at this link:
@aphalo I think the test completed on GitHub Actions Windows without crashing:
* checking for unstated dependencies in 'tests' ... OK
* checking tests ...
** running tests for arch 'i386' ...
Running 'testthat.R'
OK
** running tests for arch 'x64' ...
Running 'testthat.R'
OK
This makes me think that there must be something special about your machines, and I don't have any idea what it could be.
At this point, I believe that the best way to diagnose your crash on Windows is to run ASAN on your machine. Unfortunately, I don't have a Windows machine and I can't offer any advice for setting this up. I bet that the folks at RStudio Community can help you with this.
It seems that there is a way to run ASAN on Windows: https://devblogs.microsoft.com/cppblog/addresssanitizer-asan-for-windows-with-msvc/
I'm not familiar with MSVC or compiling code on Windows, so I can't offer more detailed guidance.
I will install an earlier version of 'ggrepel' to make sure that this is not an old problem
@aphalo Could I please ask if you have any previous version of ggrepel that did not crash on Windows?
I am wondering if we added something recently to the .cpp code that does not work well on Windows...
If we can find the offending lines, maybe we can add an if-statement to check if the platform is Windows and exclude those lines from compilation. Unfortunately, this is a bit of a ghost hunt if we can't generate any error messages.
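Since the goal would be to exclude lines from compilation, the check would most likely be a preprocessor guard rather than a runtime if-statement. A minimal sketch, assuming the standard _WIN32 macro, which Windows compilers (including MinGW-based toolchains such as Rtools) define; compiled_branch is a hypothetical helper for illustration and not a ggrepel function:

```cpp
#include <string>

// Hypothetical helper, not part of ggrepel: reports which branch the
// preprocessor kept, so the effect of the guard can be observed.
std::string compiled_branch() {
#ifdef _WIN32
  // Kept only when building on Windows (32- or 64-bit); the suspect
  // lines could be replaced or skipped here.
  return "windows";
#else
  // Kept on every other platform, where the original lines would stay.
  return "other";
#endif
}
```

A guard like this only hides the symptom on one platform. If the underlying cause is an invalid memory access, the same lines could still misbehave elsewhere without crashing.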
@aphalo In addition to testing ggrepel 0.8.2, could I ask if you might find this helpful to uncover the reason for the crash on your Windows machine?
https://github.com/r-hub/rhub/issues/442#issuecomment-758163028
@slowkow for crashes, drmingw often works well! Try in rtools:
pacman -S mingw-w64-x86_64-drmingw
And then install drmingw as the default crash debugger as described here:
drmingw -i -v
If you make it crash in RGui or RStudio, drmingw will pop up and show a stack trace (it will not work in terminal-only Rterm sessions because these are not able to spawn graphical windows).
Sorry! Yesterday, I missed your comment. Here is the dump. ggrepel-crash-dump-test.txt
@aphalo Thank you! Based on your crash dump, I think we can rest assured that ggrepel is the culprit.
LOAD_DLL PID=2100 TID=5588 lpBaseOfDll=000000001CBC0000 dplyr.dll
LOAD_DLL PID=2100 TID=5588 lpBaseOfDll=000000006ABC0000 Rcpp.dll
LOAD_DLL PID=2100 TID=5588 lpBaseOfDll=0000000062E40000 ggrepel.dll
LOAD_DLL PID=2100 TID=5588 lpBaseOfDll=0000000070E80000 farver.dll
CREATE_THREAD PID=2100 TID=2804
EXCEPTION PID=2100 TID=2804 ExceptionCode=0x80000003 dwFirstChance=1
EXIT_THREAD PID=2100 TID=2804 dwExitCode=0x0
EXCEPTION PID=2100 TID=5588 ExceptionCode=0xc0000005 dwFirstChance=0
Rterm.exe caused an Access Violation at location 0000000062E4829A in module ggrepel.dll Reading from location 000000001526F008.
AddrPC Params
0000000062E4829A 00000000043BEA80 000000001E3AD3C0 00000000043BEA30 ggrepel.dll!repel_boxes2
0000000062E43CEA 000000002088DD98 000000002087A3E0 00000000043BEDF0 ggrepel.dll!_ggrepel_repel_boxes2
000000006C7A6E3E 0000000000000000 00000000043BEFD8 0000000000000001 R.dll!Rf_NewFrameConfirm
000000006C7A7E81 000000001B8B43F8 0000000000000000 000000001B8B43F8 R.dll!Rf_NewFrameConfirm
000000006C7FBDF8 0000000004471EF0 000000001EAB7090 0000000000000000 R.dll!Rf_eval
000000006C7FF547 000000001B8B43C0 000000000447B180 000000001B8B43C0 R.dll!R_execMethod
000000006C7FBC25 00000000158C2390 000000006E62ECB0 000000001C287CB8 R.dll!Rf_eval
I searched for "Rterm.exe caused an Access Violation at location" and found a similar issue from 2003.
I'm puzzled why this is happening on Windows but not Linux or macOS.
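One possible explanation for the platform difference is that reading past the end of a buffer is undefined behaviour in C++: the same bad index may silently return garbage on one system and raise an access violation on another. A minimal sketch of how a bounds-checked lookup turns such a read into a catchable error; sum_selected and lookup_is_rejected are hypothetical helpers and do not come from ggrepel:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not ggrepel code: adds up values[i] for each i in idx.
// values.at(i) throws std::out_of_range for a bad index, whereas values[i]
// would be undefined behaviour (it may appear to work, or it may crash).
double sum_selected(const std::vector<double>& values,
                    const std::vector<std::size_t>& idx) {
  double total = 0.0;
  for (std::size_t i : idx) {
    total += values.at(i);
  }
  return total;
}

// Returns true if the lookup was rejected with a bounds error.
bool lookup_is_rejected(const std::vector<double>& values,
                        const std::vector<std::size_t>& idx) {
  try {
    sum_selected(values, idx);
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}
```

Temporarily swapping operator[] for at() in a suspect function is a cheap way to localize a bad index when ASAN is not available, at the cost of some speed.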
Could I please ask if you can try to reproduce the crash with a patched version of the code?
remotes::install_github("slowkow/ggrepel@windows-crash")
I hope that fixes it! 🤞
I am afraid it did not. ggrepel-crash-dump-windows-test.txt
I found this: https://www.gitmemory.com/issue/r-lib/callr/178/753449514 So, the way to go might be to build the package with debug information, by setting the compiler flags. I need to try this again; if I can coerce the compiler to read the Makevars.win file, or pass the information through an environment variable, then with luck the dump should include source line numbers. It is getting late here in Finland, so I will try this tomorrow, most likely in the evening.
Thank you, that's a good find! I would find it very helpful if we can assign some line numbers. I tried to catch all the possible lines where there might be an "access violation," but I guess I must have missed something.
Compiling the package with debug information did not help. I guess I need to also build R from sources with debug information. From the instructions it does not seem too difficult, but I have not done this before under Windows, so do keep fingers crossed...
@aphalo I'm sorry for asking for so much work. I really appreciate your help.
Before you embark on the (probably) lengthy journey of compiling R, I want to let you know that I might have caught another access violation. I pushed it to the windows-crash branch, and you can see all my changes at this link: https://github.com/slowkow/ggrepel/compare/windows-crash
I think it would be less work for you to test the new patched code before you recompile R.
Could I please ask if you can try again to reproduce the crash with a patched version of the code?
remotes::install_github("slowkow/ggrepel@windows-crash")
In this Stack Overflow post, Peter Alexander said:
Somewhere in your code you must be creating Vertex and Triangle objects and calling functions on them. At some point, you are probably calling the function with a pointer that hasn't been initialised.
I think I found an example of this in my code. On this line, I'm creating a vector of Point objects, but I haven't initialized them:
In the windows-crash branch, I initialize the Point objects to {0, 0}:
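The kind of explicit initialization described here can be sketched as follows. This is a minimal illustration and not the code from the branch: the Point struct below is a hypothetical stand-in (the real definition is not shown in this thread), and make_zeroed_points is an invented helper name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Point type mentioned above.
struct Point {
  double x;
  double y;
};

// Creates n points, each explicitly set to {0, 0}, so that no element can
// hold an indeterminate value regardless of how Point is declared.
std::vector<Point> make_zeroed_points(std::size_t n) {
  return std::vector<Point>(n, Point{0.0, 0.0});
}
```

Whether the original declaration really left the values indeterminate depends on how Point and the vector were declared. A plain local such as "Point p;" is uninitialized for a struct like this one, so spelling out the fill value removes any doubt.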
It does not seem to help. It still crashes when running the test case file on its own with 100 000 points. Without a local copy of the source I cannot run checks. If you want, I can pull this branch and test, but the error in the dump file is the same. I noticed that when compiling with debug information, even if no line number is provided, I get the name of the function. So here is another dump file. ggrepel-crash-dump-windows-test.txt
EXCEPTION PID=12168 TID=14668 ExceptionCode=0xc0000005 dwFirstChance=0
Rterm.exe caused an Access Violation at location 0000000062E4829A in module ggrepel.dll Reading from location 000000001B18E068.
AddrPC Params
0000000062E4829A 00000000043CE490 000000006E51109C 00000000043CE440 ggrepel.dll!repel_boxes2
0000000062E43CEA 00007FF4F7E80010 00007FF4F8010010 00000000043CE800 ggrepel.dll!_ggrepel_repel_boxes2
Just checked out the windows-crash branch and the problem seems now to be solved! I even tried 1 000 000 points and there was no crash.
Here's how you can get the code:
git clone https://github.com/slowkow/ggrepel.git
cd ggrepel
git checkout windows-crash
open src/repel_boxes.cpp
Just checked out the windows-crash branch and the problem seems now to be solved! I even tried 1 000 000 points and there was no crash.
Wonderful news! Thanks so much for your patience and for your diligence in testing! 🙏
Note for the future: to run tests within RStudio one needs to check out the source. Installing from GitHub does not work, because RStudio/devtools installs from the local sources if the installed binary does not match. This led me astray, thinking that the new branch did not solve the problem!
Summary
The following code example crashes R without issuing any error message. It crashes both in the R GUI and in RStudio. This example fails consistently on my laptop. In other cases, with my spectral data, the crashes seem to depend on the state of the operating system or something else that gets reset during rebooting. I am running this on a recent model laptop with 16 GB RAM and an AMD Ryzen 7 Pro 3700U processor.
'ggrepel' installed locally from source.
Minimal code example
Here is the minimum amount of code needed to demonstrate the issue:
Here is an image of the output produced by the code:
No output produced.
Suggestions
I do not have a proposal at this time.
Version information
Here is the output from sessionInfo() in my R session: