immunogenomics / presto

Fast Wilcoxon and auROC
144 stars 33 forks source link

CRAN prep #19

Closed AustinHartman closed 1 year ago

AustinHartman commented 1 year ago

Work in progress preparing presto for CRAN submission. Looking for any feedback related to styling, documentations, etc.

I moved DESeq2 from imports to suggests given it's used for the pseudobulk testing, but not for the main wilcox rank sum workflow. Also, since DESeq2 is a bioconductor package it will make it easier for CRAN packages, like Seurat, to import presto.

slowkow commented 1 year ago

Hi Austin, thanks for getting this started. At a glance, I like your first round of changes.

Our GitHub actions are failing:

Error: Unable to resolve action `r-lib/actions@master`, unable to find version `master`

Could I please ask you to go ahead and make these changes to the .github/workflows/R-CMD-check.yaml file:

https://github.com/immunogenomics/presto/blob/052085db9c88aa70a28d11cc58ebc807999bf0ad/.github/workflows/R-CMD-check.yaml#L36

This should be:

r-lib/actions/setup-r@v2.3.0

https://github.com/immunogenomics/presto/blob/052085db9c88aa70a28d11cc58ebc807999bf0ad/.github/workflows/R-CMD-check.yaml#L40

This should be:

r-lib/actions/setup-pandoc@v2.3.0

I think the branch called master was removed from the r-lib/actions repo, so we should use the tags instead.

AustinHartman commented 1 year ago

A few changes to the GHA:

I played around with the GHA on my fork of the repo and checks passed on windows and macOS but not yet on ubuntu which I'm still sorting out

slowkow commented 1 year ago

Hey Austin,

Thanks for working on this. Below, I share some notes on what I could observe from the logs.

In summary, I think we're on the right track, and I believe many potential GHA failures are out of your control. I might suggest running R CMD check on your laptop to avoid losing time due to these GHA issues.

Thanks for your help and patience, Kamil

Details

I noticed in your repo that macOS passed the GHA but the others did not. Weird.

I looked at one of the failed GHA runs and found this:

Quitting from lines 117-120 (getting-started.Rmd) 
Error: Error: processing vignette 'getting-started.Rmd' failed with diagnostics:
unable to find required package 'SingleCellExperiment'
--- failed re-building ‘getting-started.Rmd’
SUMMARY: processing the following file failed:
  ‘getting-started.Rmd’
Error: Error: Vignette re-building failed.
Execution halted

Here are the lines:

https://github.com/immunogenomics/presto/blob/51eefa7422c6751e8c66008ae6911c17a1389c7d/vignettes/getting-started.Rmd#L116-L120

At first I thought maybe we failed to add SingleCellExperiment to the dependencies. But I checked DESCRIPTION and see that we include the package in Suggests.

Then I checked the log file in GHA and found this:

trying URL 'https://bioconductor.org/packages/3.14/bioc/src/contrib/SingleCellExperiment_1.16.0.tar.gz'
Error in download.file(url, destfile, method, mode = "wb", ...) : 
  cannot open URL 'https://bioconductor.org/packages/3.14/bioc/src/contrib/SingleCellExperiment_1.16.0.tar.gz'
In addition: Warning message:
In download.file(url, destfile, method, mode = "wb", ...) :
  cannot open URL 'https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/3.14/bioc/src/contrib/SingleCellExperiment_1.16.0.tar.gz': HTTP status was '503 Service Unavailable'
Warning in download.packages(pkgs, destdir = tmpd, available = available,  :
  download of package ‘SingleCellExperiment’ failed

This indicates that the package is not downloading successfully. Maybe we are unlucky.

Just now, I was able to reproduce the 503 error in my web browser. A few moments later, I am now able to successfully download this file:

https://bioconductor.org/packages/3.14/bioc/src/contrib/SingleCellExperiment_1.16.0.tar.gz

I might conclude that the bioconductor server was momentarily unavailable.

Then, I decided to launch the GHA jobs in this pull request and I saw a new error message:

Run Rscript -e "remotes::install_github('r-hub/sysreqs')"
Using bundled GitHub PAT. Please add your own PAT to the env var `GITHUB_PAT`
Error: Error: Failed to install 'unknown package' from GitHub:
  cannot open URL 'https://api.github.com/repos/r-hub/sysreqs/contents/DESCRIPTION?ref=HEAD'
Execution halted
Error: Process completed with exit code 1.

According to this comment, the error might go away if we just re-run the GHA jobs: https://github.com/r-lib/remotes/issues/130#issuecomment-870680052

Conclusions

How to run R CMD check quickly

Recently, I had to run CRAN check on ggrepel a few times, and I was frustrated that each run took a lot of time. Then I found a comment by Jim Hester:

You can set the following environment variables to false to turn off the incoming checks. The first turns off all incoming checks, the second only turns off those checks which use remote resources (i.e. the internet). The latter is only available on relatively new versions of R. — Jim Hester

#export _R_CHECK_CRAN_INCOMING_=false
export _R_CHECK_CRAN_INCOMING_REMOTE_=false

R CMD build ggrepel

R CMD check --as-cran ggrepel_0.9.2.tar.gz

After setting the environment variable _R_CHECK_CRAN_INCOMING_REMOTE_ my checks were running much more quickly, so my development cycle became much faster.

AustinHartman commented 1 year ago

Thanks for investigating the logs and suggestions. I've been running R checks locally without any errors warnings or notes on an ubuntu 20.04 system with R 4.1.3, so the GHA errors were indeed perplexing. Since the GHA errors seem to be due to inconsistency in dependency installation, I can focus a bit more on documentation. Anything else I'm missing before merging and CRAN submission?

slowkow commented 1 year ago

Thank you, Austin!

Would you like to make more changes (e.g. clearer error messages, documentation)?

One thing that users might appreciate is a pkgdown website with some usage examples and tips. I think an appropriate URL for this page would be https://immunogenomics.github.io/presto

If you're ready for a merge, please let me know and I'll take it from here.

AustinHartman commented 1 year ago

I've made a few changes to the docs and examples and added a pkgdown site in the recent commits. Let me know if there are other changes you'd like, otherwise I think this is ready to merge

slowkow commented 1 year ago

@AustinHartman I forgot to ask, could you please provide your information so we can add you as a contributor in the DESCRIPTION file?

Here's mine (I will add myself, too, since I'm contributing):

person("Kamil", "Slowikowski", role = c("ctb"), comment = c(ORCID = "0000-0002-2843-6370") )
AustinHartman commented 1 year ago

Thanks - here's my info: person("Austin", "Hartman", role = c("ctb"), comment = c(ORCID = "0000-0001-7278-1852"))