Closed fweber144 closed 2 years ago
This issue does not occur on a different machine with:
RStudio Edition : Desktop
RStudio Version : RStudio 2022.07.1+554 "Spotted Wakerobin" Release (7872775ebddc40635780ca1ed238934c3345c5de, 2022-07-22) for Ubuntu Jammy; Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) QtWebEngine/5.12.8 Chrome/69.0.3497.128 Safari/537.36
OS Version : Ubuntu 22.04.1 LTS
R Version : 4.2.1 (2022-06-23)
However, on a different Windows machine (different from the one in the initial post), it is reproducible. In contrast to the machine from the initial post, that other machine has Windows 11. So this seems to be a Windows-related bug, irrespective of whether it's Windows 10 or Windows 11. In any case, I think it's crucial because I use the "Find in Files" dialog a lot for navigating around in R packages.
@fweber144 Thank you for raising this! I'm unable to reproduce the problem though, using Windows 11. I tried a few different versions and combinations, but it works. So, a couple of questions:
1) Do the records show up correctly for other files? For example, are the records in `testers.R` showing up correctly?
2) Have you used earlier versions of RStudio Desktop, and if so was it working in a previous version?
3) What happens if you uncheck "case sensitive"?
4) Similarly, what happens if you look at all files instead of only source files?
Thank you for investigating this, @ronblum. Concerning your questions:
- Do the records show up correctly for other files? For example, are the records in `testers.R` showing up correctly?
Yes, for other files further up the list, the results seem to be correct. However, it seems like the results list is "cut off" at the bottom (see my reply to point 4 below). So for files which would be further down the list, results are not shown (by coincidence, this folder does not have an alphabetically later file which contains `d_test`, but if I put `d_test` at the end of line 1 of `vignettes/projpred.Rmd`, for example, this occurrence is not shown).
- Have you used earlier versions of RStudio Desktop, and if so was it working in a previous version?
I think it was working correctly at some point in the past (because I'm using "Find in Files" a lot and never encountered any issue, but I might just not have realized this). But even if it was working correctly in the past, I can't say when this changed—sorry.
- What happens if you uncheck "case sensitive"?
No changes, the issue still persists. However, this does have an impact when looking at all files (not just source files), see point 4 below.
- Similarly, what happens if you look at all files instead of only source files?
When having "case sensitive" unchecked and looking at all files, the search results are cut off even earlier, namely at line 1254 of `tests/testthat/setup.R`. There are no results from `tests/testthat/test_varsel.R` (or `vignettes/projpred.Rmd`, if modified as described above) at all anymore. I guess the reason could be that there are more results at the beginning of the results list, so that there is some kind of hidden maximum for the number of results shown.
Interestingly, when having "case sensitive" checked and looking at all files, then the results list seems to be complete, in particular, also including line 109 from `tests/testthat/test_varsel.R` and later occurrences (i.e., also line 1 from `vignettes/projpred.Rmd`, if modified as described above). Perhaps this helps?
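To cross-check these counts outside RStudio, something like the following shell sketch could help. It is only an illustration under assumptions: it assumes a POSIX shell with GNU grep on the PATH (for example Git Bash on Windows), the function names `count_matches` and `matches_per_file` are made up for this example, and it searches all non-binary files rather than applying RStudio's `--include` filters.

```shell
# Count the lines that contain a fixed string, searching a directory recursively.
# Usage: count_matches DIR PATTERN [extra grep flags, e.g. -i]
count_matches() {
  dir=$1
  pattern=$2
  shift 2
  # -r: recurse, -h: omit file names, -F: treat the pattern as a fixed string.
  # Binary files are skipped. Any remaining arguments are passed on to grep.
  grep -r -h -F --binary-files=without-match "$@" -e "$pattern" "$dir" \
    | wc -l | tr -d ' '
}

# List "file:count" for every file with at least one matching line.
# Usage: matches_per_file DIR PATTERN
matches_per_file() {
  # -c prints a count for every file searched; drop the files with zero hits.
  grep -r -c -F --binary-files=without-match -e "$2" "$1" | grep -v ':0$'
}
```

Run from the project folder, `count_matches . d_test` and `count_matches . d_test -i` give the case-sensitive and case-insensitive totals, and `matches_per_file . d_test` shows which files contribute. These can be compared with the rows shown in the "Find in Files" tab.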
@fweber144 Thank you for looking further into this!
@jgutman I haven't been able to reproduce the issue (so far). Do you have any suggestions of what to look at or what might be going on?
@fweber144 The project you are searching is a Git repository, correct? If so, the behavior of Find in Files on Windows may depend on the version of Git installed on your Windows machine. Could you please report which version of Git you have running on the Windows machine where this fails? `system("git --version")` should be the easiest way to check. For others who have run into issues on Windows when using Find in Files in a Git repository, we've noticed they have a very old version of Git for Windows installed on their system, and updating Git often helps.
Thank you for helping here, @jgutman.
the project you are searching is a git repository correct?
No, not really. It is taken from a GitHub repo (my fork of the stan-dev projpred repo at commit https://github.com/fweber144/projpred/commit/ddaf3e976c871e083d22d0381f82d3d2414abeef), but the ZIP file I linked above doesn't include a `.git` folder (not even a Windows-hidden one). It also doesn't have a `.Rproj.user` folder. The `.github` folder only contains `FUNDING.yml`. In RStudio's global options, I have version-control systems deactivated.
Could you please report what version of git you have running on the Windows machine where this fails?
My Git version is `2.37.1.windows.1`.
- Do you have the [ ] Exclude files matched by .gitignore option set? (Is your project managed by git?)
Sorry, I'm not sure if I understand you correctly here. You mean the "Exclude these files" checkbox in the "Find in Files" dialog? I have not checked it. When checking it, no files (or patterns) are listed there. Concerning your question about Git management: Does my explanation above (the origin is a Git repo, but the ZIP file should not be affected by that) answer it?
- What version of Git for Windows do you have installed? If git is on the PATH, then `system("git --version")` should be the easiest way to check.
See above.
- If you run the code Sys.setenv(RSTUDIO_GREP_DEBUG = 1) in the console, and then perform the search, what debug output do you see in the console?
When following the steps from https://github.com/rstudio/rstudio/issues/11736#issue-1334717833, I get:
> Sys.setenv(RSTUDIO_GREP_DEBUG = 1)
"C:/Program Files/RStudio/bin/gnugrep/3.0/grep" "--binary-files=without-match" "-rHn" "--color=always" "-F" "-f" "C:/Users/<user_name>/AppData/Local/Temp/RtmpsxOA8f/rs_grep24d439171ee6.txt" "--include=*.r" "--include=*.R" "--include=*.rnw" "--include=*.Rnw" "--include=*.rmd" "--include=*.Rmd" "--include=*.rmarkdown" "--include=*.Rmarkdown" "--include=*.qmd" "--include=*.Qmd" "--include=*.md" "--include=*.rhtml" "--include=*.Rhtml" "--include=*.h" "--include=*.hpp" "--include=*.c" "--include=*.cpp" "--include=*.js" "--include=*.yml" "--include=*.yaml"
stdout: NEWS.md:17:* Argument `d_test` of `varsel()` is not considered as an internal feature anymore. This was possible after fixing a bug for `d_test` (see below). (GitHub: #341)
NEWS.md:18:* The order of the observations in the subelements of `<vsel_object>$summaries` and `<vsel_object>$d_test` now corresponds to the order of the observations in the original dataset if `<vsel_object>` was created by a call to `cv_varsel([...], cv_method = "kfold")` (formerly, in that case, the observations in those subelements were ordered by fold). Thereby, the order of the observations in those subelements now always corresponds to the order of the observations in the original dataset, except if `<vsel_object>` was created by a call to `varsel([...], d_test = <non-NULL_d_test_object>)`, in which case the order of the observations in those subelements corresponds to the order of the observations in `<non-NULL_d_test_object>`. (GitHub: #341)
NEWS.md:30:* Fix argument `d_test` of `varsel()`: Not only the predictive performance of the *reference model* needs to be evaluated on the test data, but also the predictive performance of the *submodels*. (GitHub: #341)
R/cv_varsel.R:243: d_test = sel_cv$d_test,
R/cv_varsel.R:527: d_test <- list(type = "LOO", data = NULL, offset = refmodel$offset,
R/cv_varsel.R:530: out_list <- nlist(solution_terms_cv = solution_terms_mat, summaries, d_test)
R/cv_varsel.R:549: d_test <- list(
R/cv_varsel.R:555: return(nlist(refmodel = fold$refmodel, d_test))
R/cv_varsel.R:615: test_points = fold$d_test$omitted)
R/cv_varsel.R:625: fold$d_test$omitted
R/cv_varsel.R:642: newdata = refmodel$fetch_data(obs = fold$d_test$omitted)
R/cv_varsel.R:643: ) + fold$d_test$offset
R/cv_varsel.R:646: y_test = fold$d_test, family = fold$refmodel$family,
R/cv_varsel.R:657: list(offset = fold$d_test$offset,
R/cv_varsel.R:658: weights = fold$d_test$weights,
R/cv_varsel.R:659: y = fold$d_test$y)
R/cv_varsel.R:667: d_test = c(list(type = "kfold", data = NULL), d_cv)))
R/methods.R:410: nobs_test <- nrow(object$d_test$data %||% object$refmodel$fetch_data())
R/methods.R:566: nobs_test = nrow(object$d_test$data),
R/misc.R:7:nms_d_test <- function() {
R/summary_funs.R:56: !all(varsel$d_test$weights == 1)) {
R/summary_funs.R:57: varsel$d_test$y_prop <- varsel$d_test$y / varsel$d_test$weights
R/summary_funs.R:83: res <- get_stat(summ$mu, summ$lppd, varsel$d_test, stat, mu.bs = mu.bs,
R/summary_funs.R:86: data = varsel$d_test$type, size = Inf, delta = delta, statistic = stat,
R/summary_funs.R:100: res_ref <- get_stat(summ_ref$mu, summ_ref$lppd, varsel$d_test,
R/summary_funs.R:103: res_diff <- get_stat(summ$mu, summ$lppd, varsel$d_test, stat,
R/summary_funs.R:111: data = varsel$d_test$type, size = k - 1, delta = delta,
R/summary_funs.R:117: res <- get_stat(summ$mu, summ$lppd, varsel$d_test, stat, mu.bs = mu.bs,
R/summary_funs.R:119: diff <- get_stat(summ$mu, summ$lppd, varsel$d_test, stat,
R/summary_funs.R:123: data = varsel$d_test$type, size = k - 1, delta = delta,
R/summary_funs.R:145:## `d_test$weights`. These are already taken into account by
R/summary_funs.R:149:get_stat <- function(mu, lppd, d_test, stat, mu.bs = NULL, lppd.bs = NULL,
R/summary_funs.R:180: if (is.null(d_test$y_prop)) {
R/summary_funs.R:181: y <- d_test$y
R/summary_funs.R:183: y <- d_test$y_prop
R/summary_funs.R:185: if (!all(d_test$weights == 1)) {
R/summary_funs.R:186: wcv <- wcv * d_test$weights
R/summary_funs.R:236: y <- d_test$y
R/summary_funs.R:237: if (!is.null(d_test$y_prop)) {
R/summary_funs.R:240: # `d_test$weights` contains the numbers of trials) with more than 1 trial
R/summary_funs.R:242: stopifnot(all(.is.wholenumber(d_test$weights)))
R/summary_funs.R:244: stopifnot(all(0 <= y & y <= d_test$weights))
R/summary_funs.R:246: c(rep(0L, d_test$weights[i_short] - y[i_short]),
R/summary_funs.R:249: mu <- rep(mu, d_test$weights)
R/summary_funs.R:251: mu.bs <- rep(mu.bs, d_test$weights)
R/summary_funs.R:253: n_notna <- sum(d_test$weights)
R/summary_funs.R:254: wcv <- rep(wcv, d_test$weights)
R/summary_funs.R:257: stopifnot(all(d_test$weights == 1))
R/varsel.R:13:#' @param d_test A `list` of the structure outlined in section "Argument
R/varsel.R:14:#' `d_test`" below, providing test data for evaluating the predictive
R/varsel.R:88:#' # Argument `d_test`
R/varsel.R:90:#' If not `NULL`, then `d_test` needs to be a `list` with the following
R/varsel.R:188:varsel.refmodel <- function(object, d_test = NULL, method = NULL,
R/varsel.R:217: if (is.null(d_test)) {
R/varsel.R:218: d_test <- list(type = "train", data = NULL, offset = refmodel$offset,
R/varsel.R:221: d_test$type <- "test"
R/varsel.R:222: d_test <- d_test[nms_d_test()]
R/varsel.R:247: newdata = d_test$data,
R/varsel.R:248: offset = d_test$offset,
R/varsel.R:249: wobs = d_test$weights,
R/varsel.R:250: y = d_test$y)
R/varsel.R:258: nobs_test <- nrow(d_test$data %||% refmodel$fetch_data())
R/varsel.R:261: if (d_test$type == "train") {
R/varsel.R:269: newdata_for_ref <- d_test$data
R/varsel.R:274: "`d_test$data`, but that column already exists. Please rename ",
R/varsel.R:275: "this column in `d_test$data` and try again.")
R/varsel.R:277: newdata_for_ref$projpred_internal_offs_stanreg <- d_test$offset
R/varsel.R:281: d_test$offset
R/varsel.R:285: y_test = d_test, family = refmodel$family, wsample = refmodel$wsample,
R/varsel.R:294: d_test,
tests/testthat/helpers/testers.R:1079:# @param dtest_expected If `vs` was created with a non-`NULL` argument `d_test`
tests/testthat/helpers/testers.R:1081:# `vs$d_test` object. Otherwise, this needs to be `NULL`.
tests/testthat/helpers/testers.R:1256: # d_test
tests/testthat/helpers/testers.R:1258: expect_type(vs$d_test, "list")
tests/testthat/helpers/testers.R:1259: expect_named(vs$d_test, nms_d_test(), info = info_str)
tests/testthat/helpers/testers.R:1264: expect_identical(vs$d_test$type, dtest_type, info = info_str)
tests/testthat/helpers/testers.R:1265: expect_null(vs$d_test$data, info = info_str)
tests/testthat/helpers/testers.R:1266: expect_identical(vs$d_test$offset, vs$refmodel$offset, info = info_str)
tests/testthat/helpers/testers.R:1267: expect_identical(vs$d_test$weights, vs$refmodel$wobs, info = info_str)
tests/testthat/helpers/testers.R:1268: expect_identical(vs$d_test$y, vs$refmodel$y, info = info_str)
tests/testthat/helpers/testers.R:1270: expect_identical(vs$d_test, dtest_expected, info = info_str)
tests/testthat/helpers/testers.R:1492: expect_identical(smmry$nobs_test, nrow(vsel_expected$d_test$data),
tests/testthat/setup.R:1254: "refmodel", "search_path", "d_test", "summaries", "solution_terms", "kl",
tests/testthat/setup.R:1260: "refmodel", "search_path", "d_test", "summaries", "kl", "solution_terms",
tests/testthat/test_varsel.R:76:## d_test -----------------------------------------------------------------
tests/testthat/test_varsel.R:79: "`d_test` set to the training data gives the same results as its default"
tests/testthat/test_varsel.R:102: d_test_crr <- list(
tests/testthat
Possibly also the issue reported here? https://twitter.com/LisaDeBruine/status/1572520018797297664
One interesting thing to note: it seems like the output is cut off at the end? E.g.
tests/testthat/test_varsel.R:102: d_test_crr <- list(
tests/testthat
Perhaps we're losing some output from GNU grep for some reason?
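One quick way to spot this kind of truncation in a saved copy of the output: every complete result line has the form `path:line:text`, so any line without the `:line:` part is suspect. The sketch below is only an illustration. It assumes a POSIX shell with grep, plain output without the `--color=always` escape codes, and relative paths without a drive-letter colon. `find_incomplete_lines` is a hypothetical helper name.

```shell
# Print every line of a saved grep output file that does not look like
# "path:line:text". Exits non-zero if all lines look complete.
# Usage: find_incomplete_lines FILE
find_incomplete_lines() {
  # -v inverts the match: keep only the lines that lack a "path:NUMBER:" prefix.
  grep -v -E '^[^:]+:[0-9]+:' "$1"
}
```

Applied to the two lines quoted above, this would flag the trailing `tests/testthat` fragment and leave the line for `test_varsel.R:102` alone.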
I was able to reproduce something similar locally, with output stopping with these lines:
tests/testthat/helpers/testers.R:1492: expect_identical(smmry$nobs_test, nrow(vsel_expected$d_test$data),
tests/testthat/setup.R:1254: "refmodel", "search_path", "d_test", "summaries", "solution_terms", "kl",
tests/testthat
stderr: /gnugrep/3.0/grep: .Rproj.user/1901417D/sources/session-ADC58894/lock_file: Device or resource busy
Not sure if the `stderr` output is a red herring or not.
The frustrating part is that the error seems to go away after restarting RStudio :-/
Note for QA: I was able to reproduce following the instructions in the OP (https://github.com/rstudio/rstudio/issues/11736#issue-1334717833); however, at least in my case, the issue seems to reproduce the first time the project is opened; if you close RStudio and re-open the project, the issue might go away.
For that reason, when reproducing, I recommend testing with a "fresh" copy of the folder unpacked from https://github.com/fweber144/projpred/archive/ddaf3e976c871e083d22d0381f82d3d2414abeef.zip.
Also, to the best of my knowledge, this issue should predominantly affect Windows, but in theory other platforms will be affected as well (and I'm planning a separate PR for that).
Verified on 2022.11.0-daily+215 Windows 11
Tested with OP example, works as expected. Used a fresh copy of the .zip file contents.
Indeed, I can confirm that the issue does not occur anymore with RStudio 2022.11.0-daily+215 (tested on the Windows 11 machine mentioned above). Thanks a lot to all of you!
I'm putting this back into testing as I merged a separate PR for POSIX that does the same thing (ensure we read all stdout / stderr on process exit).
Given that this affects how we read output from any child process we launch, I think this code should be well-exercised by our existing test suite (e.g. anything that uses Quarto would run through this code) so I think our existing automation in that space would suffice for testing.
@jonvanausdeln, do you have any feelings on whether additional testing is warranted for this PR?
Did a quick verification on all other platforms, so I think it's good to go now.
System details
Steps to reproduce the problem
1) Unpack the ZIP file into the folder `projpred-ddaf3e976c871e083d22d0381f82d3d2414abeef`.
2) Open `projpred.Rproj`.
3) Press `Ctrl + Shift + F` for launching the "Find in Files" dialog.
4) Enter `d_test` into the input field under "Find:".
Now, on my machine, the "Find in Files" tab (in the console pane) stops (at its bottom) at line 102 of file `tests/testthat/test_varsel.R`. However, if you open that file and search for `d_test` (only in that file, using the smaller `Ctrl + F` search bar), then you'll quickly see that there are more occurrences, e.g., in line 109.
Describe the problem in detail
The "Find in Files" search doesn't find all occurrences.
Describe the behavior you expected
I would have expected all occurrences to be listed in the "Find in Files" results of the console pane, in particular, the occurrence of `d_test` in line 109 of file `tests/testthat/test_varsel.R`. If the results were limited by a maximum number of displayed occurrences (I think 1000 is the maximum), I would have expected a red line at the bottom of the "Find in Files" results saying that there were more occurrences than those that are shown.