haibol2016 commented 4 months ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/haibol2016/CleanUpRNAseq

Confirm the following by editing each check box to '[x]'

[X] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[X ] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[ X] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[X ] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[X ] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[ X] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[X ] I am familiar with the Bioconductor code of conduct and agree to abide by it.

I am familiar with the essential aspects of Bioconductor software management, including:

[ X] The 'devel' branch for new packages and features.
[ X] The stable 'release' branch, made available every six months, for bug fixes.
[X ] Bioconductor version control using Git (optionally via GitHub).

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 4 months ago

Hi @haibol2016

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: CleanUpRNAseq
Type: Package
Title: CleanUpRNAseq detects and corrects for genomic DNA contamination
       in RNA-seq data
Version: 0.99.0
Authors@R: c(person("Haibo", "Liu", 
   role = c("aut", "cre"),
   comment=c(ORCID="0000-0002-4213-2883"),
   email = "haibo.liu@umassmed.edu"),
   person("Kevin", "O'Connor", role = "ctb", 
   email = "Kevin.O'Connor@umassmed.edu"),
   person("Michelle", "Kelliher", role = "ctb", 
   email = "Michelle.Kelliher@umassmed.edu"),
   person("Lihua Julie", "Zhu", role = "aut", 
   email = "Julie.Zhu@umassmed.edu"), 
   person("Kai", "Hu", role = "aut",
   email = "kai.hu@umassmed.edu"))
Description: RNA-seq data generated by some library preparation methods, such
   as rRNA-depletion-based method and the SMART-seq method, might be 
   contaminated by genomic DNA (gDNA), if DNase I disgestion is not
   performed properly during RNA preparation. CleanUpRNAseq is
   developed to check if RNA-seq data is suffered from gDNA 
   contamination. If so, it can perform correction for gDNA
   contamination and reduce false discovery rate of differentially
   expressed genes.
biocViews: QualityControl, Sequencing, GeneExpression
License: GPL-3
Imports: ggplot2, plyranges, DESeq2, Rsamtools, Rsubread, pheatmap, reshape2,
         GenomicRanges, edgeR, ensembldb, ggrepel, tximport, utils, magrittr,  
         grDevices, stats, graphics, Biostrings,SummarizedExperiment, methods, 
         devtools, BSgenome,qsmooth, AnnotationFilter, patchwork,
         RColorBrewer, GenomeInfoDb, BiocGenerics, limma
Depends: R (>= 4.0.0)
Suggests: BiocManager, rmarkdown, RUnit, knitr, EnsDb.Hsapiens.v86, ggplotify,
BiocStyle, R.utils, AnnotationHub, BSgenome.Hsapiens.UCSC.hg38,
VignetteBuilder: knitr
RoxygenNote: 7.3.1
Roxygen: list(markdown = TRUE)
Encoding: UTF-8
LazyData: true

haibol2016 commented 4 months ago

How long will it take to get a package submitted?

lshep commented 4 months ago

I moderate packages normally once a week. Once moderated, if a package passes the pre-check, it will build/check on our system and will require a clean build/check to be assigned a reviewer. If there were pre-review concerns those should be addressed before it will be put on our system. Reviewers generally respond in 1-2 week time frame. Any concerns should be addressed and then once accepted we will add to the Bioconductor public repository.

[ ] Please don't have an empty news file. You can initiate with something like

# Version 0.99.0

*  initial submission to Bioconductor

[ ] You will also need an inst/script directory that describes how the data in inst/extdata is created. It can be text, pseudo code or code but should describe how a user would re-create data and contain any licensing/source information.
[ ] Please remove CleanUpRNAseq.html. This file should be generated automatically from the RMD during the R CMD build process.
[ ] Data can not be stored on a dropbox or google drive. Please move to a secure location preferrably zenodo but an institutional server or ExperimentHub would also suffice.

haibol2016 commented 3 months ago

All the listed issues are fixed. Thank you.

bioc-issue-bot commented 3 months ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

haibol2016 commented 3 months ago

Hello, I ran devtools::check() on my Windows platform a moment ago. I didn't encounter any error, though. Could you let me know the platform information where the package was checked? Based on th error report, I can't figure out what caused the error.

Thank you.

Haibo

vjcitn commented 3 months ago

The platform information is on the report: https://bioconductor.org/spb_reports/CleanUpRNAseq_buildreport_20240607124952.html#nebbiolo2_buildsrc_anchor

Linux (Ubuntu 22.04.3 LTS)/x86_64

The error is pretty clear -- a function is called with argument maxMOp, but the function doesn't want such an argument 'CleanUpRNAseq.Rmd' failed with diagnostics: unused argument (maxMOp = 10)

maybe it is a spelling error or the function's requirements changed

haibol2016 commented 3 months ago

Thank you for you reply. However, in the Vignette Line 194-219, there is no such argument at all: https://github.com/haibol2016/CleanUpRNAseq/blob/devel/vignettes/CleanUpRNAseq.Rmd#L194-L219.

Haibo

lshep commented 3 months ago

> 
invisible({
    capture.output({
        counts_list <- summarize_reads(
            metadata = metadata,
            isPairedEnd = TRUE,
            strandSpecific = 0,
            saf_list = saf_list,
            gtf = gsub(".gz", "", gtf),
            threads = 1,
            verbose = FALSE
        )
    })
})
Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF,  : 
  unused argument (maxMOp = 10)

Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts removing this argument recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette

haibol2016 commented 3 months ago

Got it. I will correct it. Thank you.

Haibo

On Mon, Jun 10, 2024 at 8:22 AM lshep @.***> wrote:

invisible({ capture.output({ counts_list <- summarize_reads( metadata = metadata, isPairedEnd = TRUE, strandSpecific = 0, saf_list = saf_list, gtf = gsub(".gz", "", gtf), threads = 1, verbose = FALSE ) }) }) Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF, : unused argument (maxMOp = 10)

Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts removing this argument https://code.bioconductor.org/search/search?q=maxMOp recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette

— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2158201671, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD36RR6632WTXO5UUDLZGWLBTAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJYGIYDCNRXGE . You are receiving this because you were mentioned.Message ID: @.***>

haibol2016 commented 3 months ago

Thank you. I just fixed the bug.

Haibo

On Mon, Jun 10, 2024 at 1:07 PM Haibo Liu @.***> wrote:

Got it. I will correct it. Thank you.

Haibo

On Mon, Jun 10, 2024 at 8:22 AM lshep @.***> wrote:

invisible({ capture.output({ counts_list <- summarize_reads( metadata = metadata, isPairedEnd = TRUE, strandSpecific = 0, saf_list = saf_list, gtf = gsub(".gz", "", gtf), threads = 1, verbose = FALSE ) }) }) Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF, : unused argument (maxMOp = 10)

Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts removing this argument https://code.bioconductor.org/search/search?q=maxMOp recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette

— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2158201671, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD36RR6632WTXO5UUDLZGWLBTAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJYGIYDCNRXGE . You are receiving this because you were mentioned.Message ID: @.***>

lshep commented 3 months ago

You need to push to bioconductor and trigger a new build to continue. Please see this previous message

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 45af654bc4f15f7b87850d500d95ea28705a10b3

haibol2016 commented 3 months ago

Thank you, lshep. I finally figured out how to do that. It is first time to sumbit a package to Bioconductor. If feel it is not easy to follow the direction here [this documentation](http://contributions.bioconductor.org/git-version-control.html#new-package-workflow].

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: f4a109f20d392c32d9ec4f89c9f2b97035226a28

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 26fad64a3679b698a04fba5aca412e0bebe88351

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 8629168df8aadbbca19c6da54597d1d334f91fdc

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.4.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: e294023b2ad17fbcea215ac700d3d6cbfea36777

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.5.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 52d04cbae3b0046d2b9785972cfdde5462f29733

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.6.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 78ae3a757a70a2710693d95b9f5c0c8a57c5ebde

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.7.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: fb63e1105a70925659911bb2515af7bc9c149347

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.9.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 4ba2a247b59a57fd75b35da9ed26aa19179d9021

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 71192a90d87285567bea3b0a65cb720b42a66b42

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.10.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.11.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

haibol2016 commented 3 months ago

Hello,

   My package is now error-free but with a warning message, "WARNING: R

CMD check exceeded 10 min requirement". This is because many examples are run. Is this OK?

Thank you,

Haibo

On Mon, Jun 3, 2024 at 10:58 AM lshep @.***> wrote:

I moderate packages normally once a week. Once moderated, if a package passes the pre-check, it will build/check on our system and will require a clean build/check to be assigned a reviewer. If there were pre-review concerns those should be addressed before it will be put on our system. Reviewers generally respond in 1-2 week time frame. Any concerns should be addressed and then once accepted we will add to the Bioconductor public repository.

Please don't have an empty news file. You can initiate with something like

Version 0.99.0

initial submission to Bioconductor

-

You will also need an inst/script directory that describes how the data in inst/extdata is created. It can be text, pseudo code or code but should describe how a user would re-create data and contain any licensing/source information.

Please remove CleanUpRNAseq.html. This file should be generated automatically from the RMD during the R CMD build process.

Data can not be stored on a dropbox or google drive. Please move to a secure location preferrably zenodo but an institutional server or ExperimentHub would also suffice.

— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2145421169, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD3IMDEIDK3FWWNYUT3ZFSABDAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBVGQZDCMJWHE . You are receiving this because you were mentioned.Message ID: @.***>

bioc-issue-bot commented 3 months ago

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

LiNk-NY commented 2 months ago

Hello, My package is now error-free but with a warning message, "WARNING: R CMD check exceeded 10 min requirement". This is because many examples are run. Is this OK? Thank you, Haibo

Hi @haibol2016

Please make sure that the package examples do not exceed the time limit. Use small example datasets to demonstrate the functionality rather than biological accuracy.

Best, Marcel

haibol2016 commented 2 months ago

Hello Marcel,

      Is this really a dead rule to keep the R CMD check time < 10 min? The manuscript reporting this package has been accepted by a journal and we are expected to submit the proofread this week. We need a Bioconductor link to show that the package has been accepted by Bioconductor.  If  the issue of exceeding 10 min runtime is not that important and not other problems detected, could you accept the package as it is? I will try my best to change the examples to shorten the R CMD check time shorten than 10 min as soon as possible.

Thank you,

Haibo

LiNk-NY commented 2 months ago

Hi Haibo, @haibol2016 Yes, due to the high costs of running our systems, we would like that all packages conform to this rule. Best regards, Marcel

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c937557adfaf3d3721234824feb5ed6b45b09d79

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: CleanUpRNAseq_0.99.12.tar.gz Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.12.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 5ef7ce8cae95ac124005b1e98d0fe1dcd8ee1d6c

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: CleanUpRNAseq_0.99.13.tar.gz Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.13.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CleanUpRNAseq to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

haibol2016 commented 2 months ago

Hello Marcel,

 I fixed the timeout issue. Please review the package again.

Best,

HAibo

LiNk-NY commented 2 months ago

Hi Haibo, @haibol2016

Thank you for your contribution. Please see the review below.

Best regards, Marcel

CleanUpRNAseq #3442

The package is quite extensive and has a lot of functionality. It would be good to consider splitting the package into smaller packages with more focused functionality. Currently, it tries to do too much, i.e. data processing, statistical metrics, plotting, BSgenome forging, etc. Consider also contributing to existing packages such as BSgenomeForge and others.

DESCRIPTION

add BugReports and URL fields
Remove BiocManager from Suggests field
Update Depends field to R (>= 4.4.0)
Remove LazyData field or set to false
Consider using desc::desc to reformat the DESCRIPTION file.

NAMESPACE

UCSC2Ensembl may require a more descriptive name
Perhaps helper functions (e.g., forge_BSgenome, make_BSgenome, make_ensdb) should be in a separate package?
Consider using shorter function names

vignettes

Consider removing the lightweight wrappers from the vignette and pacakge, e.g., make_ensdb
Enable caching in the vignette and use BiocFileCache for downloading files rather than retry_download
Avoid invisible and capture.output in the vignette
No need to explicitly use print() in the vignette
Are the plots provided by the functions in the package? If so, consider removing the saved png files from the vignette as they are generated by the functions in the package.

R

Use BiocFileCache for downloading files in the examples
Avoid the extra emphasis (!) in error messages.
Use seq_len or seq_along instead of 2:ncol(...) in check_read_assignment_stat
Avoid mixing analysis and graphing functions and consider separating functionalities (in check_read_assignment_stat, check_read_distribution, check_sample_correlation, etc.)
It seems like some of the functions should be using SummarizedExperiment to represent the counts e.g., check_expr_distribution.
Consider reducing the cyclomatic complexity of functions by creating a class (with an established metadata slot) that automatically checks the data rather than having multiple checks in the functions (e.g., check_expressed_gene_percentage).
Remove retry_download and use BiocFileCache for downloading files.
Remove dontrun from the examples and use BiocFileCache to retrieve the downloaded external files.
The arguments for correct_for_contamination seem to be overloaded and could be helped by using a class that already has the metadata information.
Avoid saving PDF or any output files for the user by default. The user should be able to save the output in the desired format (in create_diagnostic_plot).
Avoid using vague function names like exploratory_analysis. This may more appropriate for a separate package with a set of functions for EDA.
Consider removing the cache_env feature in the package. Interactive analyses will use the .GlobalEnv.
For usability, it would be preferable to avoid syntax such as cache_env$counts_summary$gtf$counts_junction.
Where possible, avoid using numeric indices and use names instead (found in gc_bias_correction).
For UCSC2Ensembl, have you considered using seqlevelsStyle to convert the seqlevels instead? The function here might not be necessary.
Consider moving make_BSgenome and make_ensdb creation functions into a separate package.

tests

We strongly recommend adding unit tests to ensure that the code in the package is robust and reliable.

haibol2016 commented 2 months ago

Hello Marcel,

    Thank you very much for the detailed review and suggestions. This is my first full-blown R package. I know I still can improve the package. However, given that our manuscript has been accepted and we have to submit the proofread in a short time, I don't think it is possible to overhaul the package at this moment.

    Is it possible to accept the package as it is this moment? Otherwise, I will have to remove "Bioconductor" from the title of the manuscrip: "CleanUpRNAseq: An R/Bioconductor Package for Detecting and Correcting DNA Contamination in RNA-Seq Data".

Thank you,

Haibo

LiNk-NY commented 2 months ago

Hi Haibo, @haibol2016

Sorry to hear that and apologies for the delay in the review.

Unfortunately for now, we cannot accept the package as-is but we are happy to accept the package after consideration of the review and the appropriate changes.

Best regards, Marcel

haibol2016 commented 2 months ago

Hello Marcel, For time reason, I may not to fix all the issues you pointed out. Do I have to address all of the issues to my package to be considered by Bioconductor? Otherwise, could you please let me know what are compulsory and what are suggestion?

Thanks,

Haibo

LiNk-NY commented 2 months ago

Hi @haibol2016 Yes, please respond to the review before proceeding. Best, Marcel

haibol2016 commented 2 months ago

Hello Marcel, I will respond to your comment below each of them in italics.

The package is quite extensive and has a lot of functionality. It would be good to consider splitting the package into smaller packages with more focused functionality. Currently, it tries to do too much, i.e. data processing, statistical metrics, plotting, BSgenome forging, etc. Consider also contributing to existing packages such as BSgenomeForge and others.

I have decided to move the helper fuctions for creating BSgenome from a multifasta file out of the package.

DESCRIPTION add BugReports and URL fields Remove BiocManager from Suggests field Update Depends field to R (>= 4.4.0) Remove LazyData field or set to false Consider using desc::desc to reformat the DESCRIPTION file.

These issues in the DESCRIPTION file have been easily fixed using desc::desc. Thank you for the suggested tool.

NAMESPACE UCSC2Ensembl may require a more descriptive name Perhaps helper functions (e.g., forge_BSgenome, make_BSgenome, make_ensdb) should be in a separate package? Consider using shorter function names

_make_ensdb is deleted because it is a shallow wrapper function. forge_BSgenome and make_BSgenome and the other related to create a BSgenome from a multifasta file have been removed from the package.

The UCSC2Ensembl function is renamed as styleBSgenome.

vignettes Consider removing the lightweight wrappers from the vignette and pacakge, e.g., make_ensdb

_makeensdb function is deleted

Enable caching in the vignette and use BiocFileCache for downloading files rather than retry_download

Yes, I will use BiocFileCache for downloading files.

Avoid invisible and capture.output in the vignette

I used these functions so that the large amount of information output by featureCounts will not pollute the vignette. I can remove those functions if it is OK with lots of irrelevant information are captured in the vignette.

No need to explicitly use print() in the vignette

print() is removed

Are the plots provided by the functions in the package? If so, consider removing the saved png files from the vignette as they are generated by the functions in the package.

Because the time limit of 10 mins, I saved the png files to the vignette directory from a interactive run of the R chuncks to shorten the runtime.

R Use BiocFileCache for downloading files in the examples

Yes, BioCFileCache is a very good package to meet my purpose here.

Avoid the extra emphasis (!) in error messages.

sure

Use seq_len or seq_along instead of 2:ncol(...) in check_read_assignment_stat

OK

Avoid mixing analysis and graphing functions and consider separating functionalities (in check_read_assignment_stat, check_read_distribution, check_sample_correlation, etc.)

Here the plots is the main output. the data generated during the process is not of interest once the plots are generated.

It seems like some of the functions should be using SummarizedExperiment to represent the counts e.g., check_expr_distribution. Consider reducing the cyclomatic complexity of functions by creating a class (with an established metadata slot) that automatically checks the data rather than having multiple checks in the functions (e.g., check_expressed_gene_percentage).

_I will create a S4 class to represt the data output by summariz_reads () (a wrapper of featureCounts()) and salmonres() (a wrapper of tximport()), along with the metadata.

Remove retry_download and use BiocFileCache for downloading files.

sure

Remove dontrun from the examples and use BiocFileCache to retrieve the downloaded external files.

The two big wrapper function create_diagnostic_plots() and correct_for_contaminaiton() are main APIs to make many function calls chained together as a pipeline. To save time, I just used dontrun since the individual functions have their own examples. I am planning to remove the two major pipeline-like wrapper functions from the package.

The arguments for correct_for_contamination seem to be overloaded and could be helped by using a class that already has the metadata information.

Yes, I will do that

Avoid saving PDF or any output files for the user by default. The user should be able to save the output in the desired format (in create_diagnostic_plot).

_Since this in the pipeline-like wrapper function create_diagnosticplots(), I decide to save the plots as PDF if their sizes are not too big, otherwise tiff. To give the users the flexibility, I also return the ggplot objects.

Avoid using vague function names like exploratory_analysis. This may more appropriate for a separate package with a set of functions for EDA.

_I will rename the function to samplesimilarity().

Consider removing the cache_env feature in the package. Interactive analyses will use the .GlobalEnv.

_cache_env is an environment object used to hold download files to avoid repeated downloading for running examples. Yes, I will remove cacheenv, instead use BiocFileCache.

For usability, it would be preferable to avoid syntax such as cache_env$counts_summary$gtf$counts_junction.

This is only used in examples to access elements from nested lists

Where possible, avoid using numeric indices and use names instead (found in gc_bias_correction).

OK

For UCSC2Ensembl, have you considered using seqlevelsStyle to convert the seqlevels instead? The function here might not be necessary.

seqlevelsStyle() can only change the main chromosome names, from chr1, chr2, ... chrX to 1, 2, ..., X, but it can change the names of scaffolds.

Consider moving make_BSgenome and make_ensdb creation functions into a separate package.

Those functions will be removed from the package

tests We strongly recommend adding unit tests to ensure that the code in the package is robust and reliable.

I will add tests

Bioconductor / Contributions

Please help submit my CleanUpRNAseq package to Bioconductor #3442

Version 0.99.0

You will also need an inst/script directory that describes how the data in inst/extdata is created. It can be text, pseudo code or code but should describe how a user would re-create data and contain any licensing/source information.

Please remove CleanUpRNAseq.html. This file should be generated automatically from the RMD during the R CMD build process.

CleanUpRNAseq #3442

DESCRIPTION

NAMESPACE

vignettes

R

tests