Bioconductor / Contributions

Contribute Packages to Bioconductor
scDesign3 #2903

Closed: SONGDONGYUAN1994 closed this issue 1 year ago

SONGDONGYUAN1994 commented 1 year ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 1 year ago

Hi @SONGDONGYUAN1994

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: scDesign3
Type: Package
Title: A unified framework of realistic in silico data generation and statistical model inference for single-cell and spatial omics
Version: 0.99.0
Authors@R: 
    c(person("Dongyuan", "Song", , "dongyuansong@ucla.edu", role = c("aut", "cre"),
 comment = c(ORCID = "0000-0003-1114-1215")),
    person("Qingyang", "Wang", , "qw802@g.ucla.edu",  role = c("aut"),
 comment = c(ORCID = "0000-0002-1051-609X")))
Description: scDesign3 is an all-in-one statistical simulator to generate realistic single-cell and spatial omics data, including various cell states, experimental designs, and feature modalities, by learning interpretable parameters from real datasets. Furthermore, using a unified probabilistic model for single-cell and spatial omics data, scDesign3 can infer biologically meaningful parameters, assess the goodness-of-fit of cell clusters and trajectories, and generate in silico negative and positive controls for benchmarking computational tools.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: false
Depends: R (>= 4.2.0)
Imports:
    dplyr,
    tibble,
    stats,
    methods,
    mgcv,
    gamlss,
    gamlss.dist,
    SummarizedExperiment,
    SingleCellExperiment,
    mclust,
    Rfast,
    mvtnorm,
    parallel,
    pbmcapply,
    rvinecopulib,
    umap,
    ggplot2,
    irlba,
    viridis,
    BiocParallel
Suggests:
    magrittr,
    Seurat,
    Signac,
    matrixcalc,
    spcov,
    knitr,
    rmarkdown,
    testthat (>= 3.0.0),
    RefManageR,
    dyngen,
    cowplot,
    ggraph,
    ggrastr,
    gridExtra,
    reshape2,
    scales,
    spatialDE,
    DESeq2,
    DuoClustering2018,
    NBAMSeq,
    PseudotimeDE,
    SPARK,
    SeuratObject,
    aricode,
    ggh4x,
    ggpubr,
    ggrepel,
    igraph,
    monocle3,
    scater,
    scran,
    stringr,
    tidygraph,
    tradeSeq,
    useful,
    sessioninfo,
    gamlss.add,
    BiocStyle,
    tidyr,
    tidyselect
biocViews:
    Software,
    SingleCell,
    Sequencing,
    GeneExpression,
    Spatial
URL: https://github.com/SONGDONGYUAN1994/scDesign3
BugReports: https://github.com/SONGDONGYUAN1994/scDesign3/issues
RoxygenNote: 7.2.3
Config/testthat/edition: 3
VignetteBuilder: knitr

bioc-issue-bot commented 1 year ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your GitHub ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account.

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "TIMEOUT, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/scDesign3 to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hpages commented 1 year ago

Hi @SONGDONGYUAN1994 ,

Thanks for submitting scDesign3 to Bioconductor.

The package contains 13 vignettes and they take too long to build. I started R CMD build scDesign3 on my laptop half an hour ago and it's still running the "creating vignettes ..." step! Hence the TIMEOUT reported by the SPB (Single Package Builder) 2 days ago. See above.

Per our guidelines, the vignettes in a Bioconductor package should take less than 5 min to build. So you need to work on making the code in the vignette faster. Please read our guidelines carefully. We expect package developers to read them and have their package satisfy them before submitting their package.

Alternatively, you could move all these vignettes to a separate "workflow package", and keep a small vignette in scDesign3. See here for the list of workflow packages currently available in Bioconductor. The advantage of this separation is that the vignette in a workflow package is not required to build in less than 5 min like for a software package. Note however that a workflow package can only contain 1 vignette (the document showing the workflow), so all the vignettes that you put there would need to be merged into a single vignette.

Also:

Please note that this is not a review yet, just things that will need to be addressed before I actually start the review.

Thanks, H.

SONGDONGYUAN1994 commented 1 year ago

Hi Hervé, Thank you so much for your help, and I apologize for all the inconvenience. I will address the problems ASAP. May I ask a few questions about those issues?

  1. Vignettes take too long: sorry that this is an annoying problem. I still want to keep all vignettes in scDesign3 since I used the pkgdown to build its website. I am afraid that the code itself cannot be made faster. My current idea is to pre-save the intermediate data somewhere and set eval = FALSE for the time-consuming code chunks. Do you think this would be appropriate for Bioconductor?
  2. For Rfast, what I really need is Rfast::cora, for calculating the correlation matrix of large matrices (thousands of genes and tens of thousands of cells). Could you suggest a more mainstream alternative?

I will check the dependencies and refine them soon. Again, thanks for your support!

Best, Dongyuan

hpages commented 1 year ago

Hi Dongyuan,

My bad, when I said that a workflow package can only contain 1 vignette (the document showing the workflow), that's not true. A workflow package can actually contain several Rmd documents. See for example the RNAseq123, the chipseqDB, or the simpleSingleCell workflow.

Sorry for the confusion.

> I still want to keep all vignettes in scDesign3 since I used the pkgdown to build its website.

I don't see why you couldn't achieve the same thing with a workflow package. Keep all the R code and datasets in scDesign3, and move the vignettes to the workflow package.

For Rfast and your need for Rfast::cora(), are your matrices sparse? (this is typical for single cell data). If so, have you considered using a sparse matrix representation like the dgCMatrix container from the Matrix package? Then I believe that you should be able to use functions from the sparseMatrixStats package to efficiently compute your correlation matrices. If your matrices are so big that they don't fit in memory, please consider using an on-disk representation like HDF5Matrix from the HDF5Array package, or TileDBMatrix from the TileDBArray package. Then use functions from the DelayedMatrixStats package to compute your correlation matrices for your HDF5Matrix or TileDBMatrix objects.

Hope this helps,

H.

SONGDONGYUAN1994 commented 1 year ago

Hi Hervé, Sorry for the delay, and thank you so much for your suggestion! Unfortunately, our correlation matrix is an intermediate step and is dense, so the sparse matrix representation does not work for us. Instead, we apply a faster version of the correlation calculation from Rfast. We have removed Rfast completely, and the dependencies are now much lighter than before. We have also removed most vignettes. We would appreciate your further instructions. Thanks again!

Best regards, Dongyuan

hpages commented 1 year ago

Hi Dongyuan,

If you think that the package is ready for me to take another look, please bump its version so we get a new build report. In the event that the new report reveals new issues, please make sure to address them (and bump the version again).

Thanks, H.
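Bumping the version here means increasing the last component of the Version field in DESCRIPTION (for example 0.99.0 to 0.99.1) before pushing, since only a pushed version bump triggers a new build. The following is a minimal sketch of that bump, run against a throwaway DESCRIPTION file in a temporary directory; the file contents and the awk approach are illustrative assumptions, not a Bioconductor-provided tool.

```shell
#!/bin/sh
# Sketch only: operates on a throwaway DESCRIPTION, not on a real package.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf 'Package: scDesign3\nType: Package\nVersion: 0.99.0\n' > DESCRIPTION

# Increment the last component of the Version field (x.y.z -> x.y.(z+1)).
awk '/^Version:/ { n = split($2, v, "."); v[n] = v[n] + 1
                   out = v[1]; for (i = 2; i <= n; i++) out = out "." v[i]
                   print "Version: " out; next }
     { print }' DESCRIPTION > DESCRIPTION.new
mv DESCRIPTION.new DESCRIPTION

# Read the field back to confirm the bump.
new_version=$(sed -n 's/^Version: *//p' DESCRIPTION)
echo "bumped to $new_version"
```

After the bump, the change still has to be committed and pushed to git.bioconductor.org; that step is left out of the sketch because it needs repository access.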

hpages commented 1 year ago

@SONGDONGYUAN1994 Are you planning to follow up with this submission?

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 46c6757deb6219cef7dade14acf55c4c2ed6cb05

SONGDONGYUAN1994 commented 1 year ago

Hi Hervé, I apologize for the delay. We have now bumped the version number and pushed it to Bioconductor. I will respond to your suggestions and update our package promptly. Thank you so much!

Best regards, Dongyuan

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: eb9d2e44ce0e3a8d9830d0c7b8a1d8fd73105114

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: a2a4458b3379b46c981228ef3a366ee79dd96cb9

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: aa7ea43ea289e6160c26951a4f82061b4f10fe3f

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

hpages commented 1 year ago

Hi Dongyuan,

Ok so with all the vignettes (except one) now renamed by adding the .orig suffix to them, R CMD build is able to complete in less than 5 mins. That's one way to address the issue but is not a satisfying one. Why would you disable all the vignettes when you can keep them all in a workflow package as mentioned previously? This is not hard to do so I would strongly encourage you to consider that option.

Also instead of downloading the dozen or so datasets used in the vignettes each time with things like:

MOBSC_sce <- readRDS((url("https://figshare.com/ndownloader/files/40581983")))

I'd suggest that you download them to the user cache (i.e. to tools::R_user_dir("scDesign3", which="cache")) so they are downloaded only once to the user's machine (and to our build machines). This will not only save bandwidth but will also speed up further executions of the code in the vignettes, and so will make further runs of R CMD build faster. See https://contributions.bioconductor.org/data.html#other-data and https://contributions.bioconductor.org/r-code.html#web-querying-and-file-caching for our guidelines with respect to data download.

Note that some of your datasets (e.g. SCGEMMETH_sce and SCGEMRNA_sce) are on dropbox.com in addition to figshare.com, and you sometimes download them from one place and sometimes from the other. Please download everything from figshare.com and avoid using dropbox.com to host scientific datasets.

Thanks, H.

SONGDONGYUAN1994 commented 1 year ago

Hi Hervé, Thank you for your suggestions! We will move the vignettes to a new workflow package and rewrite them to avoid repeated downloading. The files from Dropbox should have been changed to figshare, and we will fix them this time.

Best regards, Dongyuan

lshep commented 1 year ago

@SONGDONGYUAN1994 is there an update on this issue? We like to see progress in a 2-3 week time frame to keep packages moving through the review process and get them added to Bioconductor as soon as possible. If there were changes, they should have been pushed to the git.bioconductor.org repo to produce a new build report. See the information provided in https://github.com/Bioconductor/Contributions/issues/2903#issuecomment-1440527950

SONGDONGYUAN1994 commented 1 year ago

Hi lshep, Sorry for the delay, since we kept receiving issues on our GitHub in the past month. We will push our new version today. Thanks!

Best, Dongyuan

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c3a32af466e21d7cf13235be4e3607900974d0dc

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System:

macOS 12.6.5 Monterey: scDesign3_0.99.6.tar.gz
Linux (Ubuntu 22.04.2 LTS): scDesign3_0.99.6.tar.gz

Links above active for 21 days.

SONGDONGYUAN1994 commented 1 year ago

AdditionalPackage: https://github.com/SONGDONGYUAN1994/scDesign3Workflow

bioc-issue-bot commented 1 year ago

Hi @SONGDONGYUAN1994, Thanks for submitting your additional package: https://github.com/SONGDONGYUAN1994/scDesign3Workflow. We are taking a quick look at it and you will hear back from us soon.

SONGDONGYUAN1994 commented 1 year ago

Hi, I have bumped the version of scDesign3 and created a new workflow package, scDesign3Workflow. Thank you so much!

Best, Dongyuan

hpages commented 1 year ago

Thanks @SONGDONGYUAN1994 for submitting scDesign3Workflow as an additional package.

I'm not sure the package got properly ingested to git.bioconductor.org or to the SPB though, as I'm not able to clone https://git.bioconductor.org/packages/scDesign3Workflow and it doesn't seem like we got a build/check report from the SPB yet.

Maybe some manual intervention is required on our end? @vjcitn @lshep Any thoughts? Thx

lshep commented 1 year ago

Give me a minute or two. Adding it now and kicking off the build.

bioc-issue-bot commented 1 year ago

Additional Package has been approved for building.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

JSB-UCLA commented 1 year ago

Thanks for your help! Please let me know if I need to update anything.

Best, Dongyuan

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/scDesign3Workflow to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hpages commented 1 year ago

Thanks @lshep!

@SONGDONGYUAN1994 @JSB-UCLA Please address the BUILD error (and any other remaining issues) reported by the SPB. Thanks!

SONGDONGYUAN1994 commented 1 year ago

Hi, I have bumped scDesign3Workflow from v0.99.0 to 0.99.1. However, I got this error when I pushed it:

dongyuan@lambda-server:~/package_development/scDesign3Workflow$ git push upstream main:devel
error: src refspec main does not match any
error: failed to push some refs to 'git@git.bioconductor.org:packages/scDesign3Workflow.git'

Did I miss anything? Sorry for the trouble!

Best, Dongyuan

lshep commented 1 year ago

The setup is the same as for the other package. Have you set your remotes and done a git fetch --all? Just to be clear: is the default branch of your local repository named main, and are you currently checked out on a branch called main?
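The message "src refspec main does not match any" is what git prints when the left-hand side of the refspec names a local branch that does not exist, for instance when the local branch is called master rather than main. The sketch below reproduces this in a throwaway sandbox. The local bare repository standing in for git.bioconductor.org, the branch name master, and the demo identity are all assumptions made for illustration.

```shell
#!/bin/sh
# Sandbox sketch: a local bare repo stands in for git.bioconductor.org (assumption).
set -eu
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/upstream.git"

git init --quiet "$sandbox/pkg"
cd "$sandbox/pkg"
# Force the local branch name to 'master', independent of the git default.
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.org \
    commit --quiet --allow-empty -m "Version bump"
git remote add upstream "$sandbox/upstream.git"

# The checked-out branch is the name to use on the left of the colon.
local_branch=$(git rev-parse --abbrev-ref HEAD)
echo "local branch: $local_branch"

# 'main:devel' fails here because no local branch called 'main' exists.
if git push --quiet upstream main:devel 2>/dev/null; then
    main_push=ok
else
    main_push=failed
fi
echo "push main:devel: $main_push"

# Pushing the branch that actually exists to upstream 'devel' works.
git push --quiet upstream "$local_branch:devel"
remote_branches=$(git ls-remote --heads upstream | awk '{print $2}')
echo "upstream heads: $remote_branches"
```

In a real checkout, comparing the output of git rev-parse --abbrev-ref HEAD with the branch name used in the push command is a quick way to check whether this is the cause.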

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 81a8828291ececd72337f23b9a5aefcc8e36f839

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: ERROR before build products produced.

Links above active for 21 days.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c829784feb18c81415bded68c9ea9901847247a9

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): scDesign3_0.99.7.tar.gz

Links above active for 21 days.

SONGDONGYUAN1994 commented 1 year ago

Hi, Since I need to fix the error in scDesign3Workflow, I updated scDesign3 and bumped the version to 0.99.7, but I got an error this time (weird: this error seems to be unrelated to the package itself and instead related to the download link):

lconway BUILD SRC output

R CMD BUILD

===============================

Quitting from lines 58-60 [unnamed-chunk-3] (scDesign3.Rmd)
Error: processing vignette 'scDesign3.Rmd' failed with diagnostics:
cannot read from connection
--- failed re-building scDesign3.Rmd

SUMMARY: processing the following file failed: scDesign3.Rmd

Error: Vignette re-building failed.
Execution halted

Lines 58-60 are this chunk:

example_sce <- readRDS((url("https://figshare.com/ndownloader/files/40581992")))
print(example_sce)

Thank you so much!

Best, Dongyuan

hpages commented 1 year ago

Hi Dongyuan,

I cannot reproduce this error on lconway:

lconway:sandbox biocbuild$ time R CMD build --keep-empty-dirs --no-resave-data scDesign3
* checking for file ‘scDesign3/DESCRIPTION’ ... OK
* preparing ‘scDesign3’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* looking to see if a ‘data/datalist’ file should be added
* building ‘scDesign3_0.99.7.tar.gz’

real    1m13.594s
user    0m56.509s
sys     0m8.905s

so the error we see on the SPB report was probably because of lconway having some intermittent internet access issue. Let's ignore it.

However, the scDesign3Workflow package fails for real (see SPB report from last week), apparently because it suggests the following packages which are not available in Bioconductor or CRAN: monocle3, PseudotimeDE, SPARK. Please note that we don't allow this.

Thanks, H.

SONGDONGYUAN1994 commented 1 year ago

@hpages Thank you so much! For scDesign3Workflow, my question is how should I deal with this: it suggests the following packages which are not available in Bioconductor or CRAN: monocle3, PseudotimeDE, SPARK

These packages are only available from their GitHub repos. I used them in one vignette to benchmark their performance, so they must be used. Thank you!

Best, Dongyuan

hpages commented 1 year ago

You can either:

The problem is that, generally speaking, R packages that exist only on GitHub offer no guarantee of stability and don't have any formal commitment to fixing bugs or performing any sort of QA. Also most of them won't bother to ensure that the package can be installed on Windows or Mac, and they won't provide binaries for these platforms either. This doesn't play well with the mission of Bioconductor which is to distribute robust and stable software that is easy to install on Linux, Windows, and Mac.

Thanks, H.

SONGDONGYUAN1994 commented 1 year ago

Thanks! I got it now. I will update scDesign3Workflow ASAP; it seems like I have to eliminate them.

Best, Dongyuan

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 75a9984e1e1b851393927fa8230ed1cd8b4d14e1

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: ERROR before build products produced.

Links above active for 21 days.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: b5dffe16fa0377903e9009d825df361c83e88ad9

hpages commented 1 year ago

@lshep Looks like we never got the report for the latest build (started 5 days ago).

@SONGDONGYUAN1994 Can you try to bump the version of the package again and push? Thanks

SONGDONGYUAN1994 commented 1 year ago

Thanks! Let me try this right now.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: e4da407261619e550c181a61243f72ac4b587cc2