Closed kevinrue closed 2 years ago
Hold on before consider a merge. I want to add an option to toggle this functionality on/off, as we probably want to detect (and complain) about "renv" files being pushed to
I'll ping back when it's fully ready. The current version is functional in that it always ignores "renv" files.
It looks ready to me now. Happy to hear feedback.
Example usages and outputs:
> BiocCheck::BiocCheck()
This is BiocCheck version 1.25.14. BiocCheck is a work in progress.
Output and severity of issues may change. Installing package...
* Checking for renv files
* ERROR: Detected project library files generated by `renv`. On
your computer, you can use `BiocCheck(`no-check-renv`=TRUE)`. If
this error appears on the Bioconductor Build System, use `git rm
--cached` to stop tracking the following directories and files:
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/.Rprofile
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv.lock
> BiocCheck(`no-check-renv` = TRUE)
[... the usual ...]
Note that renv
files are always ignored during the following other checks, as those checks try to parse R / Rmd / ... files also within the renv
subdirectory, causing BiocCheck to crash (see #106):
checkLogicalUseFiles
checkSystemCall
I'm not really in favor of this pull request as it stands.
A few things :
I'm curious as to why
this chokes on the coding practices section of code. The coding practice should be limited to the files in the R, man and vignette directory. The renv should not be included but it seems to have gotten the subdirectory. This would be a simple more elegant solution to fixing the initial issue with renv. renv I would imagine is not in or writing anything to those directories? so debugging to not have the subdirectory checked would be the first suggested change.
The renv files and functionality should be .gitignored
or utilized on clean branches like we recommend for other additional outside functionality (github actions, pagedown files, etc) . I could see them being problematic if they ended up on a users system, no? wouldn't this be specific to an individual not portable to other systems?
The .gitignore
will be especially important regarding the large file size. You added a section that skips renv for file size; which itself is fine for getting over the check however still problematic for Bioconductor. We impose a 5MB file size because of git requirements not as an arbitrary file size. Users utilizing this and not .gitignore
renv will then still have issues when adding the repository officially to the Bioconductor server. As it stands any package that has files over 5MB files requires exceptions and additional help/process of a core team member -- and will likely cause issues for pushing to the git server for maintenance updates.
Hi @lshep
I agree that this PR should be considered a draft, and I'm happy to edit as needed, following a discussion of what standard BiocCheck should ultimately enforce.
Starting with this point:
I'm curious as to why this chokes on the coding practices section of code.
The coding practices checks are given the package root directory, which lets some checks parse all subdirectories. Other checks are explicitly checking specific subdirectories (e.g. checkClassEqUsage
checks only R/
, vignettes/
subdirectories).
Conversely, for instance, checkLogicalUseFiles
looks for any R script in any subdirectory under the package root directory. Clearly, this is intended to capture R source code, as well as unit tests, and any other R script stored anywhere in the package, e.g. in the inst
subdirectory.
While it is technically possible to run separate commands to search for R script in an explicit list of target locations (R/
, tests/
, inst/
), this is higher maintenance and would be blind to unexpected locations where users may save R scripts. Instead, I find my approach more explicit, if not more elegant: search everywhere in the package, and then remove all matches that start with renv/
, which can be generalised to exclude more subdirectories if needed. Fair enough, the trawling does extra work looking in locations that will be later excluded, but ultimately, this has exactly the intended effect that you described:
The renv should not be included
From a more technical perspective, BiocCheck is choking because of some files in the renv/
subdirectory, that do not end with a newline character, see https://stat.ethz.ch/pipermail/r-help/2011-December/298222.html
Then,
renv I would imagine is not in or writing anything to those directories
Correct. renv
is only writing:
.Rprofile
and renv.lock
at the root of the directoryrenv
are within the renv/
subdirectorySee https://rstudio.github.io/renv/articles/renv.html
Next,
so debugging to not have the subdirectory checked would be the first suggested change.
Not having the subdirectory checked is what this PR is about, at least on the developer's computer.
Then not having the renv
subdirectory and files on the BBS is exactly the motivation for the ERROR message that I designed, specifically the part saying
If this error appears on the Bioconductor Build System, use `git rm --cached` to stop tracking the following directories and files:
I tried to keep the message reasonably short to fit in the BiocCheck report, but I'm open to rephrasing to make the ERROR clearer or more explicit.
In particular, I agree that git rm --cached
is only a good first step to get those files out of the BBS if they were pushed by mistake (the original motivation for this PR), while adding those files to .gitignore
is a more permanent solution to protect future commits.
So the ERROR message could be updated to something like:
* ERROR: Detected project library files generated by `renv`. On
your computer, you can use `BiocCheck(`no-check-renv`=TRUE)`. If
this error appears on the Bioconductor Build System, use `git rm
--cached` to stop tracking the following directories and files:
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/.Rprofile
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv.lock
then add the following entries to the package `.gitignore` file:
.Rprofile
renv
renv.lock
Next,
I could see them being problematic if they ended up on a users system, no? wouldn't this be specific to an individual not portable to other systems?
Absolutely agreed.
Again, the whole point of this PR is to throw an ERROR if renv
files ever make it into the Bioconductor git, while allowing developers to use an renv
project library when working on their package locally (i.e. having renv
files in their local package directory, but ignored by git
).
Finally,
The .gitignore will be especially important regarding the large file size. You added a section that skips renv for file size; which itself is fine for getting over the check however still problematic for Bioconductor.
Again, the renv
files should never make it to the Bioconductor repository. So those files should never contribute to the package size, or the individual file limit, on the BBS.
I thought that it would be OK to ignore renv
files during the individual file size checks, since I have already added an explicit ERROR if renv
files are detected at all. It seemed redundant to have both.
The "package total size" check still considers everything within the package directory, which would include renv
if present.
Perhaps I can re-include "renv" files in the individual file size checks, so that the total package size and the individual file checks are working on the same set of files.
That said, size checks should still excluded "renv" files on local developer computers, as "renv" files can be expected there, even if they are gitignored.
Let me know what you think!
Using version from commit f653c9f5742b56f6345fd465827e6642510333d0 Using https://github.com/kevinrue/BiocPackageSofwareTemplate to run BiocCheck (another work in progress).
Full check (intended for the BBS):
.gitignore
system()
calls> BiocCheck::BiocCheck()
This is BiocCheck version 1.25.14. BiocCheck is a work in progress. Output and severity of
issues may change. Installing package...
* Checking for renv files
* ERROR: Detected project library files generated by `renv`. On your computer, you can
use `BiocCheck(`no-check-renv`=TRUE)`. If this error appears on the Bioconductor Build
System, use `git rm --cached` to stop tracking the following directories and files in
the Bioconductor repository:
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/.Rprofile
/Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/renv.lock
Then add the following entries to the package `.gitignore` file:
.Rprofile
renv
renv.lock
* Checking Package Dependencies...
* Checking if other packages can import this one...
* Checking to see if we understand object initialization...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking version number...
* Checking version number validity...
Package version 0.0.1; pre-release
* Checking R Version dependency...
* Checking package size...
Skipped... only checked on source tarball
* Checking individual file sizes...
* WARNING: The following files are over 5MB in size:
'renv/library/R-4.0/x86_64-apple-darwin17.0/stringi/libs/icudt61l.dat
renv/library/R-4.0/x86_64-apple-darwin17.0/stringi/libs/stringi.so'
* Checking biocViews...
* Checking that biocViews are present...
* ERROR: No biocViews terms found.
See http://bioconductor.org/developers/how-to/biocViews/
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for proper Description: field...
* NOTE: The Description field in the DESCRIPTION is made up by less than 3 sentences.
Please consider expanding this field, and structure it as a full paragraph
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball name...
* Checking for Version field...
* Checking for valid maintainer...
* Checking DESCRIPTION/NAMESPACE consistency...
* Checking vignette directory...
This is an unknown type of package
* ERROR: No 'vignettes' directory.
* Checking library calls...
* Checking for library/require of BiocPackageSofwareTemplate...
* Checking coding practice...
grep: /Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/R/*: No such file or directory
* Checking parsed R code in R directory, examples, vignettes...
* Checking function lengths
* Checking man page documentation...
* Checking package NEWS...
* NOTE: Consider adding a NEWS file, so your package news will be included in
Bioconductor release announcements.
* Checking unit tests...
* NOTE: Consider adding unit tests. We strongly encourage them. See
http://bioconductor.org/developers/how-to/unitTesting-guidelines/.
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source...
* Checking if package already exists in CRAN...
* Checking for bioc-devel mailing list subscription...
* NOTE: Cannot determine whether maintainer is subscribed to the bioc-devel mailing list
(requires admin credentials). Subscribe here:
https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
Maintainer is registered at support site.
Summary:
ERROR count: 3
WARNING count: 1
NOTE count: 4
For detailed information about these checks, see the BiocCheck vignette, available at
https://bioconductor.org/packages/3.11/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output
BiocCheck FAILED.
$error
[1] "Detected project library files generated by `renv`. On your computer, you can use `BiocCheck(`no-check-renv`=TRUE)`. If this error appears on the Bioconductor Build System, use `git rm --cached` to stop tracking the following directories and files in the Bioconductor repository:"
[2] "No biocViews terms found."
[3] "No 'vignettes' directory."
$warning
[1] "The following files are over 5MB in size: 'renv/library/R-4.0/x86_64-apple-darwin17.0/stringi/libs/icudt61l.dat renv/library/R-4.0/x86_64-apple-darwin17.0/stringi/libs/stringi.so'"
$note
[1] "The Description field in the DESCRIPTION is made up by less than 3 sentences. Please consider\nexpanding this field, and structure it as a full paragraph"
[2] "Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements."
[3] "Consider adding unit tests. We strongly encourage them. See\n http://bioconductor.org/developers/how-to/unitTesting-guidelines/."
[4] "Cannot determine whether maintainer is subscribed to the bioc-devel mailing list (requires\nadmin credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel"
Check ignoring "renv" files (intended for developer's computer):
system()
calls> BiocCheck::BiocCheck(`no-check-renv`=TRUE)
This is BiocCheck version 1.25.14. BiocCheck is a work in progress. Output and severity of
issues may change. Installing package...
* Checking Package Dependencies...
* Checking if other packages can import this one...
* Checking to see if we understand object initialization...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking version number...
* Checking version number validity...
Package version 0.0.1; pre-release
* Checking R Version dependency...
* Checking package size...
Skipped... only checked on source tarball
* Checking individual file sizes...
* Checking biocViews...
* Checking that biocViews are present...
* ERROR: No biocViews terms found.
See http://bioconductor.org/developers/how-to/biocViews/
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for proper Description: field...
* NOTE: The Description field in the DESCRIPTION is made up by less than 3 sentences.
Please consider expanding this field, and structure it as a full paragraph
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball name...
* Checking for Version field...
* Checking for valid maintainer...
* Checking DESCRIPTION/NAMESPACE consistency...
* Checking vignette directory...
This is an unknown type of package
* ERROR: No 'vignettes' directory.
* Checking library calls...
* Checking for library/require of BiocPackageSofwareTemplate...
* Checking coding practice...
grep: /Users/kevin/github/kevinrue/BiocPackageSofwareTemplate/R/*: No such file or directory
* Checking parsed R code in R directory, examples, vignettes...
* Checking function lengths
* Checking man page documentation...
* Checking package NEWS...
* NOTE: Consider adding a NEWS file, so your package news will be included in
Bioconductor release announcements.
* Checking unit tests...
* NOTE: Consider adding unit tests. We strongly encourage them. See
http://bioconductor.org/developers/how-to/unitTesting-guidelines/.
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and vignette source...
* Checking if package already exists in CRAN...
* Checking for bioc-devel mailing list subscription...
* NOTE: Cannot determine whether maintainer is subscribed to the bioc-devel mailing list
(requires admin credentials). Subscribe here:
https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
Maintainer is registered at support site.
Summary:
ERROR count: 2
WARNING count: 0
NOTE count: 4
For detailed information about these checks, see the BiocCheck vignette, available at
https://bioconductor.org/packages/3.11/bioc/vignettes/BiocCheck/inst/doc/BiocCheck.html#interpreting-bioccheck-output
BiocCheck FAILED.
$error
[1] "No biocViews terms found." "No 'vignettes' directory."
$warning
character(0)
$note
[1] "The Description field in the DESCRIPTION is made up by less than 3 sentences. Please consider\nexpanding this field, and structure it as a full paragraph"
[2] "Consider adding a NEWS file, so your package news will be included in Bioconductor release announcements."
[3] "Consider adding unit tests. We strongly encourage them. See\n http://bioconductor.org/developers/how-to/unitTesting-guidelines/."
[4] "Cannot determine whether maintainer is subscribed to the bioc-devel mailing list (requires\nadmin credentials). Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel"
A good place for more extended instructions is in the vignette, with the error message saying to look there for a more complete explanation.
The only coding practice check that doesn't check only R man or vignette seems to be the checkSystemCall which for consistency probably should be updated to the limited set of directories. If the checkSystemCall was updated for that it seems your issues would be resolved?
I agree then that maybe we should add a check for the renv directory or files but I think it should be in the location where we check for other system files and bad files? here
checkLogicalUseFiles()
also trawls the entire package directory pkgdir
, searching for files with a whole lot of extensions that can contain R code.
This was actually my original motivation for the PR, as this function ultimately sources all the R code in the package in the findLogicalFile()
function.
At the moment, this function does break down R scripts into those under "R/" and "other", which includes R scripts under "renv/".
But it also picks up all other extensions (manFiles, RNWFiles, RMDFiles) anywhere in the package, which again typically includes a number of Rmd files under "renv/").
This crashes BiocCheck, as some Rmd files under "renv" cannot be evaluated because they depend on a range of packages (e.g. emo
in my issue #106)
See https://github.com/Bioconductor/BiocCheck/blob/master/R/util.R#L721
My PR ignores all files under the "renv" subdirectory, to avoid sourcing code that is not part of the package.
Again, this is all only to allow BiocCheck to run on the developer's computer, where "renv" files can be expected, using the no-check-renv
=TRUE option
I agree that "renv" files should never be git tracked, (neither GitHub nor Bioconductor git), and that the vignette should educate developers about ignoring those files from the start, or untracking those files if they were added by accident.
I take it that's a "no" then :)
I also wonder what happened. A few words would have been nice.
Hi Kevin! @kevinrue
I was doing clean up of branches and deleted the master
branch (in favor of the devel
branch).
I guess that automatically closes the PR. Sorry!
The BiocCheck
codebase has changed significantly since your PR. It will likely not merge without
conflicts.
Regarding the aims of the PR, I'm open to implementing checking of stray renv
files in an
existing check.
There are about two packages that have an renv.lock
file in them :
https://code.bioconductor.org/search/search?q=f%3Arenv%5C.lock%24
From what I gather, the issue is more related to BiocCheck
not having a consistent way of checking files
and folders so that when renv
is used, it crashes. It seems to me that updating that mechanism is the way
to go rather than making exceptions in further code.
name: Ignore renv files and folders during checks about: Submit a pull request that resolves an BiocCheck issue #106 title: "[PR] Your pull request" labels: '' assignees: ''
If you are submitting a pull request to
BiocCheck
please follow the instructions outlined in this presentation. This presentation includes steps for forking, creating a branch for you to work on, and useful related information.Prior to sending the pull request, please verify that
R CMD build
andR CMD check
run without warnings or errors on the latest Bioconductor-devel (currently in May 2020 that would be Bioconductor 3.12) and complete the following steps:The easiest way to obtain a working environment with Bioconductor-devel in your computer is to use the Bioconductor devel docker image as described in detail in the Bioconductor website.
For more questions, please get in touch with the Bioconductor core team through the Bioconductor Slack workspace.