Closed bschilder closed 10 months ago
Hi @bschilder thanks for your work on this.
I don't think Rproj files belong in git -- these are user preferences rather than a part of the package; after all I don't include .emacs ... Also for instance if I read this correctly something like NumSpacesForTab: 2
is not consistent with the 4-space indentation in the package (and more generally bioconductor core?) programming style, the package doesn't use devtools paradigms, etc. So I'm not sure that this is adding value.
I'm concerned about the support costs of rworkflows, which I am not familiar with and which would fall on maintainer shoulders if your interests were to migrate elsewhere (the first commit to Rsamtools in 2009...).
I'm also not confident that the workflow is 'doing the right thing' and will continue to do the right thing, with respect to versioning. For instance the Linux distribution appears to use R-devel and bioconductor/bioconductor_docker:devel, which is correct for the current and next Bioconductor release cycle, but will be incorrect for the cycle after that (bioconductor has a twice yearly release cycle whereas R has an annual cycle; bioconductor devel anticipates the user environment when it becomes release, so bioc-devel only builds on R-devel 1/2 the time). Likewise macOS and Windows seem to be using R 'latest' but bioc-devel, which is not how Bioconductor distributes packages (or perhaps 'latest' is a synonym for R-devel, but then this just highlights the implicit support costs...).
I see also 'as_cran' but Bioconductor does not check packages that way, using instead the R environment defined at http://bioconductor.org/checkResults/release/bioc-LATEST/Renviron.bioc.
The issue
❯ checking foreign function calls ... NOTE
Registration problems:
symbol ‘func’ not in namespace:
.Call(func, .extptr(file), regions, flag, simpleCigar, tagFilter,
mapqFilter, ...)
seems to be incorrect -- the pattern is foo <- function(func, ...) ... .Call(func, ...)
. The other issues were already known and / or rationalized from the Bioconductor build report.
Thanks for all the thoughtful comments @mtmorgan !
Hi @bschilder thanks for your work on this.
I don't think Rproj files belong in git -- these are user preferences rather than a part of the package; after all I don't include .emacs ... Also for instance if I read this correctly something like
NumSpacesForTab: 2
is not consistent with the 4-space indentation in the package (and more generally bioconductor core?) programming style, the package doesn't use devtools paradigms, etc. So I'm not sure that this is adding value.
I agree, this was added by mistake since I assumed you already had those file types added to your .gitignore. I can fix this.
I'm concerned about the support costs of rworkflows, which I am not familiar with and which would fall on maintainer shoulders if your interests were to migrate elsewhere (the first commit to Rsamtools in 2009...).
True, but that's kind of the point of rworkflows
. Rather than having a static workflow that each user must maintain, it calls a centralized action that our Neurogenomics Lab maintains. To me, this is even less risky than having a Suggest dependency (with a lot of benefits), as it does not affect whether your package gets distributed to Bioc, it is simply an intermediate check for developers to get immediate feedback on whether they're code is working, rather than waiting several days to see whether things have worked on Bioc (and at that point, any bugs are now being distributed to Bioc users).
I do understand the fear, but the same is true for any package or software that is maintained on a volunteer basis, including Rsamtools
itself!
If it's any consolation, I currently maintain >25 R packages that all depend on rworkflows
. So I have pretty good incentive to keep maintaining it ;) Plus, the goal is to grow the popularity of rworkflows
such that it benefits from a lot of community involvement.
I'm also not confident that the workflow is 'doing the right thing' and will continue to do the right thing, with respect to versioning. For instance the Linux distribution appears to use R-devel and bioconductor/bioconductor_docker:devel, which is correct for the current and next Bioconductor release cycle, but will be incorrect for the cycle after that (bioconductor has a twice yearly release cycle whereas R has an annual cycle; bioconductor devel anticipates the user environment when it becomes release, so bioc-devel only builds on R-devel 1/2 the time). Likewise macOS and Windows seem to be using R 'latest' but bioc-devel, which is not how Bioconductor distributes packages.
I'm open to suggestions regarding the 'right thing' to do!
As a developer of a handful of Bioc packages myself, I'm very much interested in making rworkflows
fully compatible with the Bioc schedule.
My logic with having Linux use the devel version was that it would allow developers to be ahead of the curve, whereas Mac/Windows reflects the latest release.
These are of course all things you have full control over within your yaml file. I've done this step for you, but the R function rworkflows::use_workflow()
has a bunch of arguments you can adjust to generate the workflow script with whatever your preferences are. That said, I'd like to programmatically extract the 'correct' versions for R and Bioc such that they match the Bioc release schedule, so that users never have to worry about that. I've added it as an Issue here:
https://github.com/neurogenomics/rworkflows/issues/33
I see also 'as_cran' but Bioconductor does not check packages that way, using instead the R environment defined at http://bioconductor.org/checkResults/release/bioc-LATEST/Renviron.bioc.
You're welcome to change any of the rworkflows
parameters as you see fit. To change thing, simply set as_cran: {{ false }}
in the yaml file. I'd also be happy to change this for you.
The issue
❯ checking foreign function calls ... NOTE Registration problems: symbol ‘func’ not in namespace: .Call(func, .extptr(file), regions, flag, simpleCigar, tagFilter, mapqFilter, ...)
seems to be incorrect -- the pattern is
foo <- function(func, ...) ... .Call(func, ...)
. The other issues were already known and / or rationalized from the Bioconductor build report.
I see, perhaps there's a setting in rcmdcheck
I can adjust. Still, it's just a Note so it shouldn't affect the ability to rworkflows
to pass.
My larger concern were the Warning messages. I'm not familiar with C code myself, but perhaps you'll have some insights into this.
❯ checking compiled code ... WARNING
File ‘Rsamtools/libs/Rsamtools.so’:
Found ‘___stderrp’, possibly from ‘stderr’ (C)
Objects: ‘bam_sort.o’, ‘bamfile.o’, ‘bedidx.o’, ‘sam_opts.o’,
‘sam_utils.o’, ‘samtools_patch.o’
Found ‘___stdoutp’, possibly from ‘stdout’ (C)
Objects: ‘bam_sort.o’, ‘sam_utils.o’
Found ‘_abort’, possibly from ‘abort’ (C)
Object: ‘bam_sort.o’
File ‘Rsamtools/libs/Rsamtools.so’:
Found no call to: ‘R_useDynamicSymbols’
Thanks again, Brian
Hi @mtmorgan, I've gone through and implemented many of your helpful suggestions.
as_cran
to false in the rworkflows
workflow file within GenomicRanges
.I hope this helps alleviate at least some of the main concerns! Let me know if there's anything else I can do to help.
rcmdcheck
Regarding these warnings, trying to find which parameters i need to change so that these are ignored (if you can confirm they are spurious warnings):
❯ checking compiled code ... WARNING
File ‘Rsamtools/libs/Rsamtools.so’:
Found ‘___stderrp’, possibly from ‘stderr’ (C)
Objects: ‘bam_sort.o’, ‘bamfile.o’, ‘bedidx.o’, ‘sam_opts.o’,
‘sam_utils.o’, ‘samtools_patch.o’
Found ‘___stdoutp’, possibly from ‘stdout’ (C)
Objects: ‘bam_sort.o’, ‘sam_utils.o’
Found ‘_abort’, possibly from ‘abort’ (C)
Object: ‘bam_sort.o’
File ‘Rsamtools/libs/Rsamtools.so’:
Found no call to: ‘R_useDynamicSymbols’
Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs.
It is good practice to register native routines and to disable symbol
search.
See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.
Potentially related Issues: https://github.com/airoldilab/sgd/issues/44 https://github.com/ropensci/git2r/issues/83 https://github.com/microsoft/LightGBM/issues/1909
I tried these suggested fixes but tools::package_native_routine_registration_skeleton(".")
simply yielded a message "no native symbols were extracted". And it looks like the NAMESPACE file already has useDynLib(Rsamtools, .registration=TRUE)
as suggested in the post.
https://stackoverflow.com/questions/42313373/r-cmd-check-note-found-no-calls-to-r-registerroutines-r-usedynamicsymbols
One way to get around this is simply skipping the rcmdcheck
step, and instead rely only on BiocCheck
to catch problems.
BiocCheck
BiocCheck
issues several errors and warnings .
> BiocCheck::BiocCheck()
─ BiocCheckVersion: 1.34.2
─ BiocVersion: 3.16
─ Package: Rsamtools
─ PackageVersion: 2.16.1
─ sourceDir: /Users/schilder/Desktop/forks/Rsamtools
─ installDir: /var/folders/zq/h7mtybc533b1qzkys_ttgpth0000gn/T//RtmplVBUQa/file1794b3d48fbd8
─ BiocCheckDir: /Users/schilder/Desktop/forks/Rsamtools.BiocCheck
─ platform: unix
─ isTarBall: FALSE
* Installing package...
* Checking for deprecated package usage...
* Checking for remote package usage...
* Checking for 'LazyData: true' usage...
* Checking version number...
* Checking version number validity...
* Checking R version dependency...
* NOTE: Update R version dependency from 3.5.0 to 4.2.0.
* Checking package size...
Skipped... only checked on source tarball
* Checking individual file sizes...
* WARNING: The following files are over 5MB in size:
'.git/objects/pack/pack-6e2cbf0bd241b8744672adb07f33834ac8daa212.pack'
* Checking biocViews...
* Checking that biocViews are present...
* Checking package type based on biocViews...
Software
* Checking for non-trivial biocViews...
* Checking that biocViews come from the same category...
* Checking biocViews validity...
* Checking for recommended biocViews...
* NOTE: Consider adding these automatically suggested biocViews: ATACSeq,
SequenceMatching, VariantAnnotation, MultipleSequenceAlignment
Search 'biocViews' at https://contributions.bioconductor.org
* Checking build system compatibility...
* Checking for blank lines in DESCRIPTION...
* Checking if DESCRIPTION is well formatted...
* Checking for proper Description: field...
* NOTE: The Description field in the DESCRIPTION is made up by less than 3
sentences. Please consider expanding this field, and structure it as a
full paragraph
* Checking for whitespace in DESCRIPTION field names...
* Checking that Package field matches directory/tarball name...
* Checking for Version field...
* Checking for valid maintainer...
* ERROR: Do not use Author/Maintainer fields. Use Authors@R.
* ERROR: Remove Maintainer field. Use Authors@R [cre] designation.
* Checking License: for restrictive use...
* Checking for pinned package versions...
* Checking DESCRIPTION/NAMESPACE consistency...
* Checking .Rbuildignore...
* Checking for stray BiocCheck output folders...
* Checking vignette directory...
Warning in remind_sweave(if (in.file) input, sweave_lines) :
It seems you are using the Sweave-specific syntax in line(s) 9, 105, 355, 447, 504; you may need Sweave2knitr("/Users/schilder/Desktop/forks/Rsamtools/vignettes//Rsamtools-Overview.Rnw") to convert it to knitr
Warning in remind_sweave(if (in.file) input, sweave_lines) :
It seems you are using the Sweave-specific syntax in line(s) 9, 105, 355, 447, 504; you may need Sweave2knitr("/Users/schilder/Desktop/forks/Rsamtools/vignettes//Rsamtools-Overview.Rnw") to convert it to knitr
Warning in remind_sweave(if (in.file) input, sweave_lines) :
It seems you are using the Sweave-specific syntax in line(s) 9, 105, 355, 447, 504; you may need Sweave2knitr("/Users/schilder/Desktop/forks/Rsamtools/vignettes//Rsamtools-Overview.Rnw") to convert it to knitr
Warning in remind_sweave(if (in.file) input, sweave_lines) :
It seems you are using the Sweave-specific syntax in line(s) 9, 105, 355, 447, 504; you may need Sweave2knitr("/Users/schilder/Desktop/forks/Rsamtools/vignettes//Rsamtools-Overview.Rnw") to convert it to knitr
Warning in remind_sweave(if (in.file) input, sweave_lines) :
It seems you are using the Sweave-specific syntax in line(s) 9, 105, 355, 447, 504; you may need Sweave2knitr("/Users/schilder/Desktop/forks/Rsamtools/vignettes//Rsamtools-Overview.Rnw") to convert it to knitr
* Checking package installation calls in R code...
* Checking for library/require of Rsamtools...
* Checking coding practice...
* NOTE: Avoid sapply(); use vapply()
* NOTE: Avoid the use of 'paste' in condition signals
* Checking parsed R code in R directory, examples, vignettes...
* NOTE: Avoid '<<-' if possible (found 2 times)
* Checking function lengths...
* NOTE: The recommended function length is 50 lines or less. There are 411
functions greater than 50 lines.
* Checking man page documentation...
* WARNING: Add non-empty \value sections to the following man pages:
* ERROR: At least 80% of man pages documenting exported objects must have
runnable examples.
* NOTE: Usage of dontrun{} / donttest{} tags found in man page examples. 11%
of man pages use at least one of these tags.
* NOTE: Use donttest{} instead of dontrun{}.
* Checking package NEWS...
* Checking unit tests...
* Checking skip_on_bioc() in tests...
* Checking formatting of DESCRIPTION, NAMESPACE, man pages, R source, and
vignette source...
* NOTE: Consider shorter lines; 21 lines (0%) are > 80 characters long.
* NOTE: Consider 4 spaces instead of tabs; 70 lines (1%) contain tabs.
* NOTE: Consider multiples of 4 spaces for line indents; 1633 lines (18%)
are not.
See https://contributions.bioconductor.org/r-code.html
See styler package: https://cran.r-project.org/package=styler as described
in the BiocCheck vignette.
* Checking if package already exists in CRAN...
'getOption("repos")' replaces Bioconductor standard repositories, see '?repositories' for
details
replacement repositories:
CRAN: https://cran.rstudio.com/
* Checking for bioc-devel mailing list subscription...
* NOTE: Cannot determine whether maintainer is subscribed to the Bioc-Devel
mailing list (requires admin credentials). Subscribe here:
https://stat.ethz.ch/mailman/listinfo/bioc-devel
* Checking for support site registration...
Maintainer email is ok.
─ BiocCheck results ──
1 ERRORS | 2 WARNINGS | 14 NOTES
See the Rsamtools.BiocCheck folder and run
browseVignettes(package = 'BiocCheck')
for details.
I've gone ahead and reformatted these so that they're compliant with the latest release of BiocCheck
, by collapsing them into one "Authors@R" field.
I'll leave this one to you @mtmorgan so you can decide on what the examples should be, and for which functions they're included in.
man/ApplyPileupsParam-class.Rd, man/BamFile-class.Rd, man/BamViews-class.Rd, man/BcfFile-class.Rd, man/FaFile-class.Rd, man/PileupFiles-class.Rd, man/RsamtoolsFile-class.Rd, man/RsamtoolsFileList-class.Rd, man/ScanBamParam-class.Rd, man/ScanBcfParam-class.Rd, man/TabixFile-class.Rd
I'll also leave this to you as you know these functions best.
Hope this helps!, Brian
Hi @mtmorgan, just wanted to check in and see whether you might want to merge this PR.
To recap:
BiocCheck
has indicated there is 1 Error and 2 Warnings. These can be addressed after merge. Or the setting in BiocCheck
can be changed if you disagree with the error/warnings.rworkflows
, that's done entirely on the rworkflows
developer's side (i.e. me and my lab). Hope that helps clear things up, happy to answer any questions!
I'll leave this for @hpages or @vjcitn to resolve.
We will look at it.
Hi @bschilder,
A few minor things that caught my attention:
Rsamtools-Overview.Rnw
) which got removed from the package in April 2023.NEWS
says CHANGES IN VERSION 2.16.1
but I don't see a version bump, and the version of the package is 2.17.0 in devel
.https://github.com/Bioconductor/Rsamtools
to the URL
field in the DESCRIPTION file. The BugReports
field already points people to the issue tracker on GitHub. Bioconductor prefers to make the package landing page (https://bioconductor.org/packages/Rsamtools
) the primary URL, and not advertize the URL of the Github repo so much because this encourages people to install the package from there (which is bad) or to use the issue tracker to ask questions about how to use the software (which is also bad).Thanks!
Also how is the addition of a .gitignore
file related to the "rworkflows" functionality? This is distracting. If you want to make the case for adding this file please open an issue and/or submit a separate PR. We already discussed this here :wink:
In the same spirit, the new .Rbuildignore
file should only have the ^.github$
line. AFAIK the other lines are not related to the "rworkflows" functionality either.
Again, PRs should stick to the matter and not try to squeeze random/unrelated changes. Thanks!
Finally I'm not sure I understand the following lines:
run_bioccheck: ${{ true }}
run_rcmdcheck: ${{ false }}
Does it mean that the workflow will run BiocCheck
but not R CMD check
?
Same with the following lines:
has_testthat: ${{ true }}
has_runit: ${{ false }}
Shouldn't this be the otherway around?
- It looks like the PR re-introduces the Sweave vignette (
Rsamtools-Overview.Rnw
) which got removed from the package in April 2023.
Likely due to the lag between when I first made the PR (Dec 23, 2022) and now. I merged the latest version of Rsamtools into my fork but I guess some files were kept by accident. Easy fix, just removed it.
- The new entry in
NEWS
saysCHANGES IN VERSION 2.16.1
but I don't see a version bump,and the version of the package is 2.17.0 in
devel
.
Again due to the lag, and apparently the fact that the NEWS file isn't being regularly updated? In any case, just updated the version in both files.
- Please don't add
https://github.com/Bioconductor/Rsamtools
to theURL
field in the DESCRIPTION file. TheBugReports
field already points people to the issue tracker on GitHub. Bioconductor prefers to make the package landing page (https://bioconductor.org/packages/Rsamtools
) the primary URL, and not advertize the URL of the Github repo so much because this encourages people to install the package from there (which is bad) or to use the issue tracker to ask questions about how to use the software (which is also bad).
Removed.
Also how is the addition of a
.gitignore
file related to the "rworkflows" functionality? This is distracting. If you want to make the case for adding this file please open an issue and/or submit a separate PR. We already discussed this here 😉
Simply wanted to make sure I wasn't pushing any extra files that I didn't intend to (e.g macs are constantly generating .DS_Store files). But happy to remove this and remove any other residual files manually.
In the same spirit, the new
.Rbuildignore
file should only have the^.github$
line. AFAIK the other lines are not related to the "rworkflows" functionality either.
Not quite. These are directly related to rworkflows
, as the files I listed there can be generated in the course of GHA checks on certain platforms (namely Windows). I figured I'd add them to save you from scratching your head when things fail later.
Finally I'm not sure I understand the following lines:
run_bioccheck: ${{ true }} run_rcmdcheck: ${{ false }}
Does it mean that the workflow will run
BiocCheck
but notR CMD check
?
Yup, but you can change it to whatever configuration your prefer! Just let me know which you would like and I'd be happy to do it for you.
Same with the following lines:
has_testthat: ${{ true }} has_runit: ${{ false }}
Shouldn't this be the otherway around?
In your case yes, thanks for the catch! These are simply the defaults. Just fixed this.
That said, outside of this PR I'd highly encourage you to consider adding some more targeted unit tests, rather than relying solely on the generic BiocGenerics:::testPackage
function which doesn't cover most scenarios from my understanding. But perhaps something to tackle another day.
If you're unsure about any of the parameters, you can always read about each of them in more detail here, or use:
?rworkflows::use_workflow
I'm not convinced that these lines in .Rbuildignore
are related to the "rworkflows" functionality:
^.*\.Rproj$
^\.Rproj\.user$
^migration_notes.md$
so maybe at least those can be removed?
Not sure about the other ones:
node_modules$
package-lock\.json$
package\.json$
but I don't see them here so I'm confused!
I would suggest that you set run_rcmdcheck:
to ${{ true }}
in the PR. Isn't running R CMD check
the main purpose of "rworkflows"? Even better, the factory settings for "rworkflows" should focus on trying to reproduce the build/check results produced by the official daily builds, to avoid confusion for developers and users of the package. For this purpose I would also suggest that you set run_bioccheck:
to ${{ false }}
in the PR.
The potential for confusion is real and should not be underestimated. Just earlier this morning @vjcitn merged your other "rworkflows" PR for VariantAnnotation (https://github.com/Bioconductor/VariantAnnotation/pull/73), but then a couple hours later Vince had to revert the merge because check was failing, probably because the yaml file also had run_bioccheck:
set to ${{ true }}
.
As mentioned already here, BiocCheck has a tendency to produce some false positives which is why we don't run it as part of our daily builds yet.
About the NEWS
file. Some Bioconductor packages adopt the "one big section per Bioconductor release" approach. That means that all the changes that happened in the package between BioC 3.17 and BioC 3.18 are documented in a big section that starts with:
CHANGES IN VERSION X.Y
----------------------
where X.Y
is the major version that the package will have at the time of the release (2.18 in the case of Rsamtools). That big section is itself organized in subsections, typically NEW FEATURES
, SIGNIFICANT USER-VISIBLE CHANGE
, DEPRECATED AND DEFUNCT
, and BUG FIXES
.
That said, outside of this PR I'd highly encourage you to consider adding some more targeted unit tests, rather than relying solely on the generic
BiocGenerics:::testPackage
function which doesn't cover most scenarios from my understanding.
Not sure what you mean exactly. Calling BiocGenerics:::testPackage("MyPackage")
is how we run the unit tests in the RUnit world. So this would be like saying that test_check("MyPackage")
doesn't cover most scenarios in the testthat world. :thinking:
Why add an empty line far down in the NEWS
file? (just above the CHANGES IN VERSION 2.10
section). Again let's please avoid polluting the PR with random/unrelated changes. It's easy to spot this kind of pollution by checking the diff here: https://github.com/Bioconductor/Rsamtools/pull/48/files
Thanks
I'm not convinced that these lines in
.Rbuildignore
are related to the "rworkflows" functionality:^.*\.Rproj$ ^\.Rproj\.user$ ^migration_notes.md$
so maybe at least those can be removed?
Sure, removed.
Not sure about the other ones:
node_modules$ package-lock\.json$ package\.json$
but I don't see them here so I'm confused!
The issue is intermittent, haven't figured out the logic to it yet (partly due to how GHA sets up their VMs, I suspect). Can remove if you like, I'll leave it to you to fix if and and when the errors return.
I would suggest that you set
run_rcmdcheck:
to${{ true }}
in the PR. Isn't runningR CMD check
the main purpose of "rworkflows"? Even better, the factory settings for "rworkflows" should focus on trying to reproduce the build/check results produced by the official daily builds, to avoid confusion for developers and users of the package. For this purpose I would also suggest that you setrun_bioccheck:
to${{ false }}
in the PR.
Done.
The potential for confusion is real and should not be underestimated. Just earlier this morning @vjcitn merged your other "rworkflows" PR for VariantAnnotation (Bioconductor/VariantAnnotation#73), but then a couple hours later Vince had to revert the merge because check was failing, probably because the yaml file also had
run_bioccheck:
set to${{ true }}
. As mentioned already here, BiocCheck has a tendency to produce some false positives which is why we don't run it as part of our daily builds yet.
I don't have any issues with BiocCheck atm. But if you do, I recommend take it up with the BiocCheck team: https://github.com/Bioconductor/BiocCheck
About the
NEWS
file. Some Bioconductor packages adopt the "one big section per Bioconductor release" approach. That means that all the changes that happened in the package between BioC 3.17 and BioC 3.18 are documented in a big section that starts with:CHANGES IN VERSION X.Y ----------------------
where
X.Y
is the major version that the package will have at the time of the release (2.18 in the case of Rsamtools). That big section is itself organized in subsections, typicallyNEW FEATURES
,SIGNIFICANT USER-VISIBLE CHANGE
,DEPRECATED AND DEFUNCT
, andBUG FIXES
.
Up to you how you want to document things, just pointing out the reason why it occurred.
That said, outside of this PR I'd highly encourage you to consider adding some more targeted unit tests, rather than relying solely on the generic
BiocGenerics:::testPackage
function which doesn't cover most scenarios from my understanding.Not sure what you mean exactly. Calling
BiocGenerics:::testPackage("MyPackage")
is how we run the unit tests in the RUnit world. So this would be like saying thattest_check("MyPackage")
doesn't cover most scenarios in the testthat world. 🤔
We can discuss this in more detail elsewhere.
Why add an empty line far down in the
NEWS
file? (just above theCHANGES IN VERSION 2.10
section). Again let's please avoid polluting the PR with random/unrelated changes. It's easy to spot this kind of pollution by checking the diff here: https://github.com/Bioconductor/Rsamtools/pull/48/files
Done.
@hpages if you're interested in learning about the purpose of rworkflows
, I'd recommend you take a look at the documentation:
https://github.com/neurogenomics/rworkflows/issues
As well as the associated preprint: https://www.researchsquare.com/article/rs-2399015/v1
Thanks for all the changes and the links you provided above.
Why remove the .Rbuildignore
file? Don't we need to prevent the .github
folder to end up in the source tarball?
Up to you how you want to document things etc...
I don't maintain Rsamtools and didn't choose the style/format of its NEWS
file (even though I adopted that style in the packages I maintain). But it's usually on the authors of a PR to do their best effort to follow the style used in the piece of software they are contributing to. Hence the reason I took the time to give you feedback on how NEWS
should be formatted.
BTW Bioconductor has moved to using devel
instead of master
but I just spotted that the yaml file sill uses the latter. The other PRs you've submitted for other Bioconductor packages seem to have the same problem. Thx
Thanks for all the changes and the links you provided above.
Why remove the
.Rbuildignore
file? Don't we need to prevent the.github
folder to end up in the source tarball?
Sorry, I interpreted your previous messages as wanting to remove the .Rbuildignore
file. I'll add it back.
BTW Bioconductor has moved to using
devel
instead ofmaster
but I just spotted that the yaml file sill uses the latter. The other PRs you've submitted for other Bioconductor packages seem to have the same problem. Thx
Thanks for noticing this, thought I had added "devel" but i guess not. I've adjusted this one, and will do so for the others.
PS - It looks like when I forked the repo GitHub automatically must have renamed the branch "master". I'll keep an eye out for this in the future: https://github.com/neurogenomics/Rsamtools/tree/master
It looks like when I forked the repo GitHub automatically must have renamed the branch "master"
Or maybe you forked before we renamed the branch? The master
-> devel
renaming happened this year (see the various announcements on the bioc-devel mailing lists about that). However it seems that you forked before that.
Any chance you can amend your addition to the NEWS
file? All you need to do is replace
CHANGES IN VERSION 2.17.1
with
CHANGES IN VERSION 2.18
Finally can you please explain how the yaml file makes sure that the right version of Bioconductor is used for building and checking the package? I see that the use of the bioconductor/bioconductor_docker:devel
container is hardcoded in the yaml file. Note that this would only work for the devel
branch of the package, but not for the RELEASE_*
branches. What am I missing?
Thanks again, H.
@hpages
Can you elaborate on what false positives you are seeing in BiocCheck
? (ideally with reproducible examples)
Thanks!
@LiNk-NY Not really the place here to discuss this but it's actually a mix of:
Sys.setenv
),Authors@
).Thanks, I've moved the discussion to https://github.com/Bioconductor/BiocCheck/issues/197
@bschilder I wanted to mention in connection with this PR (and others) that our team has been discussing the proposal to adopt rworkflows for this and other packages. The current thinking is that an intensive multiparty discussion about specific GHA-oriented solutions for Bioconductor packages is needed. We will be in touch as this discussion is scheduled.
@bschilder I wanted to mention in connection with this PR (and others) that our team has been discussing the proposal to adopt rworkflows for this and other packages. The current thinking is that an intensive multiparty discussion about specific GHA-oriented solutions for Bioconductor packages is needed. We will be in touch as this discussion is scheduled.
@vjcitn I've actually been working on adding some new features to rworkflows to make it even more Bioc friendly. will share v soon! and thanks, looking forward to the discussion
It looks like when I forked the repo GitHub automatically must have renamed the branch "master"
Or maybe you forked before we renamed the branch? The
master
->devel
renaming happened this year (see the various announcements on the bioc-devel mailing lists about that). However it seems that you forked before that.
Ah you're right, that makes more sense actually.
Any chance you can amend your addition to the
NEWS
file? All you need to do is replaceCHANGES IN VERSION 2.17.1
with
CHANGES IN VERSION 2.18
Sure, I'm assuming you want me to bump the version in DESCRIPTION as well?
Finally can you please explain how the yaml file makes sure that the right version of Bioconductor is used for building and checking the package? I see that the use of the
bioconductor/bioconductor_docker:devel
container is hardcoded in the yaml file. Note that this would only work for thedevel
branch of the package, but not for theRELEASE_*
branches. What am I missing?
I'm not sure I would call thebioconductor/bioconductor_docker:devel
parameter "hard coded", in the same way I wouldn't call any of the other parameters in the workflow file hard coded (e.g. run_biocheck: {{ false }}
). These are all parameters which the user can change to fit their needs and preferences. So it's up the user which is the "right" version of bioconductor they wish to run CI on for each VM.
Let's take a closer look:
matrix:
config:
- os: ubuntu-latest
r: devel
bioc: devel
cont: bioconductor/bioconductor_docker:devel
rspm: https://packagemanager.rstudio.com/cran/__linux__/focal/release
- os: macOS-latest
r: latest
bioc: release
- os: windows-latest
r: latest
bioc: release
You'll notice that on the ubuntu-latest
VM the versions of r
/ bioc
/ cont
are all coordinated (ie. they use devel
). Similarly, on the VMs macOS-latest
and windows-latest
the versions of r
/ bioc
/ cont
are all coordinated (ie. they use latest
/release
).
Users can indeed change these parameters to suit their needs, but I provide a bit of extra help within the rworkflows::use_workflow
function via the rworkflows::construct_runners
helper subfunction. The assists users when selecting the VM runner options that make sense together, which is reflected in the example above.
Here is the relevant documentation: https://neurogenomics.github.io/rworkflows/reference/use_workflow.html https://neurogenomics.github.io/rworkflows/reference/construct_runners.html
Now, you might ask "Why not just pass a simplified parameter to specify the runners, instead of including the matrix:config
explicitly within the workflow?". Well there's a couple reason for this, the main one being that GHA only lets you specify the runner within workflow files, not action scripts. For this reason, I can't simply pass in a parameter named something like: use_runner: ubuntu-latest
.
If I understand correctly, there is a second point that you've made regarding the coordination between branches and VM runners.
So for example, if you make pushes to the RELEASE_3.17
branch, you may only want to run VMs using the release version of Bioc. And when making pushes to the 'devel' branch to may only want to run VMs using the devel version of Bioc. In this case, you would simply generate two separate workflow files like so:
(note i'm using the dev version of rworkflows here bc i just edited some of the args to make this even easier.)
remotes::install_github("neurogenomics/rworkflows@dev")
#### Write devel branch workflow ####
v <- "devel"
f1 <- use_workflow(name = paste("rworkflows",v,sep="."),
branches = v,
runners = construct_runners(bioc = v),
preview = TRUE,
force_new = TRUE,
save_dir = tempdir() # For demo only, use default in practice
)
#### Write RELEASE_3_17 branch workflow ####
v <- "RELEASE_3_17"
f2 <- use_workflow(name = paste("rworkflows",v,sep="."),
branches = v,
runners = construct_runners(bioc = v),
preview = TRUE,
force_new = TRUE,
save_dir = tempdir() # For demo only, use default in practice
)
The above code will generate two workflow files; one for the devel
branch and one for the RELEASE_3_17
branch. They will each use the respectively appropriate version of Bioc.
Quick question: How do you learn that the RSPM / PPM now uses jammy
instead of focal
? Is this done manually?
Currently, the Bioconductor devel container uses codename jammy
and the config should look like:
cont: bioconductor/bioconductor_docker:devel
rspm: https://packagemanager.rstudio.com/cran/__linux__/jammy/release
verified with:
docker run -it bioconductor/bioconductor_docker:devel bash
root@:/# lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
Quick question: How do you learn that the RSPM / PPM now uses
jammy
instead offocal
? Is this done manually?Currently, the Bioconductor devel container uses codename
jammy
and the config should look like:cont: bioconductor/bioconductor_docker:devel rspm: https://packagemanager.rstudio.com/cran/__linux__/jammy/release
verified with:
docker run -it bioconductor/bioconductor_docker:devel bash root@:/# lsb_release -a Distributor ID: Ubuntu Description: Ubuntu 22.04.3 LTS Release: 22.04 Codename: jammy
Oh interesting! I didn't realise this was an element that was liable to change. I can add a function to use_workflows
that infers the correct codename.
Made a dedicated Issue for it here: https://github.com/neurogenomics/rworkflows/issues/73
Sure, I'm assuming you want me to bump the version in DESCRIPTION as well?
Nope. You've already bumped it to 2.17.1, which is how it should be. (Remember that the core team will bump the package version to 2.18 at release time.)
So for example, if you make pushes to the RELEASE_3.17 branch, you may only want to run VMs using the release version of Bioc. And when making pushes to the 'devel' branch to may only want to run VMs using the devel version of Bioc.
Exactly. Using the right VM for the right branch is very important. It's important to realize that trying to build/check a specific branch (e.g. RELEASE_3_17
) of a Bioconductor package on the wrong container (e.g. bioconductor/bioconductor_docker:devel
) produces meaningless results. I was hoping that this would be somehow achievable via a single static workflow file, but, based on your explanations, it seems that one would need to edit the workflow file in the new branch after each release. This is a concern that I wish could be addressed.
In the meantime, since your PR is against the devel
branch, and the workflow file in this PR uses bioconductor/bioconductor_docker:devel
, I still don't understand why RELEASE_**
is listed under branches
:
push:
branches:
- devel
- RELEASE_**
pull_request:
branches:
- devel
- RELEASE_**
What am I missing?
Thanks, H.
Sure, I'm assuming you want me to bump the version in DESCRIPTION as well?
Nope. You've already bumped it to 2.17.1, which is how it should be. (Remember that the core team will bump the package version to 2.18 at release time.)
Ok, thanks for clarifying .
So for example, if you make pushes to the RELEASE_3.17 branch, you may only want to run VMs using the release version of Bioc. And when making pushes to the 'devel' branch to may only want to run VMs using the devel version of Bioc.
Exactly. Using the right VM for the right branch is very important. It's important to realize that trying to build/check a specific branch (e.g.
RELEASE_3_17
) of a Bioconductor package on the wrong container (e.g.bioconductor/bioconductor_docker:devel
) produces meaningless results. I was hoping that this would be somehow achievable via a single static workflow file, but, based on your explanations, it seems that one would need to edit the workflow file in the new branch after each release. This is a concern that I wish could be addressed.In the meantime, since your PR is against the
devel
branch, and the workflow file in this PR usesbioconductor/bioconductor_docker:devel
, I still don't understand whyRELEASE_**
is listed underbranches
:
I haven't made the changes yet. Was waiting to hear whether you'd approve of the multi-workflow approach. Will modify now.
I haven't made the changes yet. Was waiting to hear whether you'd approve of the multi-workflow approach. Will modify now.
Done!
Replying to @hpages comments here as requested:
As @vjcitn mentioned here, adoption of rworkflows for core Bioconductor packages, as well as its official endorsement for Bioconductor packages in general, would still require a multiparty discussion that goes beyond this particular PR.
Sure! Though that was proposed back in September so I think it would be good to figure out a concrete time to have that discussion. Otherwise, I think it's liable to fall off the radar indefinitely (understandably, we're all quite busy).
Perhaps the next TAB meeting would be an appropriate time? @vjcitn I realise there's a lot to discuss in these meetings, so if it's too much to pack into TAB perhaps a separate meeting could be scheduled?
I'd also argue using a tool in a couple of packages doesn't necessarily amount to "official Bioconductor-wide endorsement" of that tool, whether the tool is rworkflows
or otherwise. That, to my mind, is a separate issue from this PR.
Personally, I still share the same concern as @mtmorgan about the workflow "doing the right thing", and continuing to do the right thing all year round, with respect to versioning. See Martin's comment here for the details.
Fair enough, but if you'd like some input to the direction of rworkflows, you're always more than welcome to join the monthly Cloud Methods Working Group meetings. Just sent you an invite to the dedicated Slack channel in case you'd like to join the conversation.
You can also always propose any points of improvement as an Issue or a PR on the rworkflows repo.
Regardless, as I've mentioned before I do think there's a strong argument to have some kind of GitHub-based CI vs. none. I think rworkflows via this PR is the easiest avenue to that, and if you change your mind later it's trivial to remove it (just delete the workflow file).
So even if you're not a fan of rworkflows, I highly recommend you implement something sooner than later (e.g. biocthis, custom workflow scripts, etc.). As we've seen here, GH-based CI can identify issues before they're distributed to Bioc.
Best, Brian
Personally, I still share the same concern as @mtmorgan about the workflow "doing the right thing", and continuing to do the right thing all year round, with respect to versioning. See Martin's comment here for the details.
Fair enough, but if you'd like some input to the direction of rworkflows, you're always more than welcome to join the monthly Cloud Methods Working Group meetings. Just sent you an invite to the dedicated Slack channel in case you'd like to join the conversation.
I see that you added me to this channel. I didn't receive the invite.
You can also always propose any points of improvement as an Issue or a PR on the rworkflows repo.
All I want to know is if this is something you're planning to fix. This is a major blocking issue IMO and it's been mentioned and discussed many times.
All I want to know is if this is something you're planning to fix. This is a major blocking issue IMO and it's been mentioned and discussed many times.
@hpages To my knowledge, all changes you requested were completed a while ago. Is there anything else you'd like me to address?
@hpages To my knowledge, all changes you requested were completed a while ago. Is there anything else you'd like me to address?
Now I'm confused. How did you address the versioning concern? You still have:
r: devel
bioc: devel
cont: bioconductor/bioconductor_docker:devel
in the config file, which means that R devel is going to be installed and used on top of the bioconductor_docker:devel
container. As explained on several occasions in many different places, this will only be good half of the year, but not all year round. Also why would you do that if the container already has the right version of R?
What's even more confusing is that, one week ago, when I reminded you about this (see above), you seemed to aknowledge that this was still an issue (your answer was "Fair enough etc.."), and you even invited me to discuss it in the monthly Cloud Methods Working Group meetings. But now you say that all the changes I requested were completed... :thinking: ?!!
Maybe we should just close this for now and wait until things have been discussed and sorted out by the Cloud Methods Working Group. It was probably premature for you to submit a PR for something like this in the first place given that there are several alternatives around that should be put on the table by the working group. Once the working group makes a decision, they can always make an announcement on the bioc-devel mailing list and/or the community-bioc Slack.
I just added rworkflows to Rsamtools to have some immediate checks on your code (across 3 platforms) via GitHub Actions
There's still a couple of remaining Warnings and Notes from the previous version that i wasn't sure how to resolve. But since these were occurring in Rsamtools before I made my changes, I think it's safe to merge the PR and then sort out the source of these issues.
@vobencha @hpages @mtmorgan
Thanks, and let me know if you have any questions!
Best, Brian