Closed haibol2016 closed 1 month ago
Hi @haibol2016
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: CleanUpRNAseq
Type: Package
Title: CleanUpRNAseq detects and corrects for genomic DNA contamination
in RNA-seq data
Version: 0.99.0
Authors@R: c(person("Haibo", "Liu",
role = c("aut", "cre"),
comment=c(ORCID="0000-0002-4213-2883"),
email = "haibo.liu@umassmed.edu"),
person("Kevin", "O'Connor", role = "ctb",
email = "Kevin.O'Connor@umassmed.edu"),
person("Michelle", "Kelliher", role = "ctb",
email = "Michelle.Kelliher@umassmed.edu"),
person("Lihua Julie", "Zhu", role = "aut",
email = "Julie.Zhu@umassmed.edu"),
person("Kai", "Hu", role = "aut",
email = "kai.hu@umassmed.edu"))
Description: RNA-seq data generated by some library preparation methods, such
as rRNA-depletion-based method and the SMART-seq method, might be
contaminated by genomic DNA (gDNA), if DNase I disgestion is not
performed properly during RNA preparation. CleanUpRNAseq is
developed to check if RNA-seq data is suffered from gDNA
contamination. If so, it can perform correction for gDNA
contamination and reduce false discovery rate of differentially
expressed genes.
biocViews: QualityControl, Sequencing, GeneExpression
License: GPL-3
Imports: ggplot2, plyranges, DESeq2, Rsamtools, Rsubread, pheatmap, reshape2,
GenomicRanges, edgeR, ensembldb, ggrepel, tximport, utils, magrittr,
grDevices, stats, graphics, Biostrings,SummarizedExperiment, methods,
devtools, BSgenome,qsmooth, AnnotationFilter, patchwork,
RColorBrewer, GenomeInfoDb, BiocGenerics, limma
Depends: R (>= 4.0.0)
Suggests: BiocManager, rmarkdown, RUnit, knitr, EnsDb.Hsapiens.v86, ggplotify,
BiocStyle, R.utils, AnnotationHub, BSgenome.Hsapiens.UCSC.hg38,
VignetteBuilder: knitr
RoxygenNote: 7.3.1
Roxygen: list(markdown = TRUE)
Encoding: UTF-8
LazyData: true
How long will it take to get a package submitted?
I moderate packages normally once a week. Once moderated, if a package passes the pre-check, it will build/check on our system and will require a clean build/check to be assigned a reviewer. If there were pre-review concerns those should be addressed before it will be put on our system. Reviewers generally respond in 1-2 week time frame. Any concerns should be addressed and then once accepted we will add to the Bioconductor public repository.
# Version 0.99.0
* initial submission to Bioconductor
[ ] You will also need an inst/script directory that describes how the data in inst/extdata is created. It can be text, pseudo code or code but should describe how a user would re-create data and contain any licensing/source information.
[ ] Please remove CleanUpRNAseq.html. This file should be generated automatically from the RMD during the R CMD build process.
[ ] Data can not be stored on a dropbox or google drive. Please move to a secure location preferrably zenodo but an institutional server or ExperimentHub would also suffice.
All the listed issues are fixed. Thank you.
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hello, I ran devtools::check() on my Windows platform a moment ago. I didn't encounter any error, though. Could you let me know the platform information where the package was checked? Based on th error report, I can't figure out what caused the error.
Thank you.
Haibo
The platform information is on the report: https://bioconductor.org/spb_reports/CleanUpRNAseq_buildreport_20240607124952.html#nebbiolo2_buildsrc_anchor
Linux (Ubuntu 22.04.3 LTS)/x86_64
The error is pretty clear -- a function is called with argument maxMOp, but the function doesn't want such an argument 'CleanUpRNAseq.Rmd' failed with diagnostics: unused argument (maxMOp = 10)
maybe it is a spelling error or the function's requirements changed
Thank you for you reply. However, in the Vignette Line 194-219, there is no such argument at all: https://github.com/haibol2016/CleanUpRNAseq/blob/devel/vignettes/CleanUpRNAseq.Rmd#L194-L219.
Haibo
>
invisible({
capture.output({
counts_list <- summarize_reads(
metadata = metadata,
isPairedEnd = TRUE,
strandSpecific = 0,
saf_list = saf_list,
gtf = gsub(".gz", "", gtf),
threads = 1,
verbose = FALSE
)
})
})
Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF, :
unused argument (maxMOp = 10)
Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts
removing this argument recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette
Got it. I will correct it. Thank you.
Haibo
On Mon, Jun 10, 2024 at 8:22 AM lshep @.***> wrote:
invisible({ capture.output({ counts_list <- summarize_reads( metadata = metadata, isPairedEnd = TRUE, strandSpecific = 0, saf_list = saf_list, gtf = gsub(".gz", "", gtf), threads = 1, verbose = FALSE ) }) }) Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF, : unused argument (maxMOp = 10)
Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts removing this argument https://code.bioconductor.org/search/search?q=maxMOp recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette
— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2158201671, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD36RR6632WTXO5UUDLZGWLBTAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJYGIYDCNRXGE . You are receiving this because you were mentioned.Message ID: @.***>
Thank you. I just fixed the bug.
Haibo
On Mon, Jun 10, 2024 at 1:07 PM Haibo Liu @.***> wrote:
Got it. I will correct it. Thank you.
Haibo
On Mon, Jun 10, 2024 at 8:22 AM lshep @.***> wrote:
invisible({ capture.output({ counts_list <- summarize_reads( metadata = metadata, isPairedEnd = TRUE, strandSpecific = 0, saf_list = saf_list, gtf = gsub(".gz", "", gtf), threads = 1, verbose = FALSE ) }) }) Error in featureCounts(files = bamfiles, annot.ext = annotation, isGTFAnnotationFile = isGTF, : unused argument (maxMOp = 10)
Reproducible. While this is not directly in this code I think it is a cascade affect from Rsubread::featureCounts removing this argument https://code.bioconductor.org/search/search?q=maxMOp recently. You are still using that argument in your summarize_reads which eventually gets called through this code in your vignette
— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2158201671, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD36RR6632WTXO5UUDLZGWLBTAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJYGIYDCNRXGE . You are receiving this because you were mentioned.Message ID: @.***>
You need to push to bioconductor and trigger a new build to continue. Please see this previous message
Received a valid push on git.bioconductor.org; starting a build for commit id: 45af654bc4f15f7b87850d500d95ea28705a10b3
Thank you, lshep. I finally figured out how to do that. It is first time to sumbit a package to Bioconductor. If feel it is not easy to follow the direction here [this documentation](http://contributions.bioconductor.org/git-version-control.html#new-package-workflow].
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: ERROR before build products produced.
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: f4a109f20d392c32d9ec4f89c9f2b97035226a28
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.2.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 26fad64a3679b698a04fba5aca412e0bebe88351
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.3.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 8629168df8aadbbca19c6da54597d1d334f91fdc
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.4.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: e294023b2ad17fbcea215ac700d3d6cbfea36777
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.5.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 52d04cbae3b0046d2b9785972cfdde5462f29733
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.6.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 78ae3a757a70a2710693d95b9f5c0c8a57c5ebde
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.7.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: fb63e1105a70925659911bb2515af7bc9c149347
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.9.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 4ba2a247b59a57fd75b35da9ed26aa19179d9021
Received a valid push on git.bioconductor.org; starting a build for commit id: 71192a90d87285567bea3b0a65cb720b42a66b42
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.10.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.11.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hello,
My package is now error-free but with a warning message, "WARNING: R
CMD check exceeded 10 min requirement". This is because many examples are run. Is this OK?
Thank you,
Haibo
On Mon, Jun 3, 2024 at 10:58 AM lshep @.***> wrote:
I moderate packages normally once a week. Once moderated, if a package passes the pre-check, it will build/check on our system and will require a clean build/check to be assigned a reviewer. If there were pre-review concerns those should be addressed before it will be put on our system. Reviewers generally respond in 1-2 week time frame. Any concerns should be addressed and then once accepted we will add to the Bioconductor public repository.
- Please don't have an empty news file. You can initiate with something like
Version 0.99.0
initial submission to Bioconductor
-
You will also need an inst/script directory that describes how the data in inst/extdata is created. It can be text, pseudo code or code but should describe how a user would re-create data and contain any licensing/source information.
Please remove CleanUpRNAseq.html. This file should be generated automatically from the RMD during the R CMD build process.
Data can not be stored on a dropbox or google drive. Please move to a secure location preferrably zenodo but an institutional server or ExperimentHub would also suffice.
— Reply to this email directly, view it on GitHub https://github.com/Bioconductor/Contributions/issues/3442#issuecomment-2145421169, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALR5WD3IMDEIDK3FWWNYUT3ZFSABDAVCNFSM6AAAAABIGWPPH2VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBVGQZDCMJWHE . You are receiving this because you were mentioned.Message ID: @.***>
A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.
Hello, My package is now error-free but with a warning message, "WARNING: R CMD check exceeded 10 min requirement". This is because many examples are run. Is this OK? Thank you, Haibo
Hi @haibol2016
Please make sure that the package examples do not exceed the time limit. Use small example datasets to demonstrate the functionality rather than biological accuracy.
Best, Marcel
Hello Marcel,
Is this really a dead rule to keep the R CMD check time < 10 min? The manuscript reporting this package has been accepted by a journal and we are expected to submit the proofread this week. We need a Bioconductor link to show that the package has been accepted by Bioconductor. If the issue of exceeding 10 min runtime is not that important and not other problems detected, could you accept the package as it is? I will try my best to change the examples to shorten the R CMD check time shorten than 10 min as soon as possible.
Thank you,
Haibo
Hi Haibo, @haibol2016 Yes, due to the high costs of running our systems, we would like that all packages conform to this rule. Best regards, Marcel
Received a valid push on git.bioconductor.org; starting a build for commit id: c937557adfaf3d3721234824feb5ed6b45b09d79
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: CleanUpRNAseq_0.99.12.tar.gz Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.12.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 5ef7ce8cae95ac124005b1e98d0fe1dcd8ee1d6c
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: macOS 12.7.1 Monterey: CleanUpRNAseq_0.99.13.tar.gz Linux (Ubuntu 22.04.3 LTS): CleanUpRNAseq_0.99.13.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CleanUpRNAseq
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hello Marcel,
I fixed the timeout issue. Please review the package again.
Best,
HAibo
Hi Haibo, @haibol2016
Thank you for your contribution. Please see the review below.
Best regards, Marcel
The package is quite extensive and has a lot of functionality. It would be
good to consider splitting the package into smaller packages with more focused
functionality. Currently, it tries to do too much, i.e. data processing,
statistical metrics, plotting, BSgenome forging, etc. Consider also
contributing to existing packages such as BSgenomeForge
and others.
BugReports
and URL
fieldsBiocManager
from Suggests
fieldDepends
field to R (>= 4.4.0)
LazyData
field or set to false
desc::desc
to reformat the DESCRIPTION
file.UCSC2Ensembl
may require a more descriptive nameforge_BSgenome
, make_BSgenome
,
make_ensdb
) should be in a separate package?make_ensdb
BiocFileCache
for downloading files
rather than retry_download
invisible
and capture.output
in the vignetteprint()
in the vignettepng
files from the vignette as they are generated by the
functions in the package.BiocFileCache
for downloading files in the examples!
) in error messages.seq_len
or seq_along
instead of 2:ncol(...)
in
check_read_assignment_stat
check_read_assignment_stat
, check_read_distribution
,
check_sample_correlation
, etc.)SummarizedExperiment
to
represent the counts e.g., check_expr_distribution
.metadata
slot) that automatically checks the data
rather than having multiple checks in the functions (e.g.,
check_expressed_gene_percentage
).retry_download
and use BiocFileCache
for downloading files.dontrun
from the examples and use BiocFileCache
to retrieve
the downloaded external files.correct_for_contamination
seem to be overloaded and could
be helped by using a class that already has the metadata information.create_diagnostic_plot
).exploratory_analysis
. This may more
appropriate for a separate package with a set of functions for EDA.cache_env
feature in the package. Interactive
analyses will use the .GlobalEnv
.cache_env$counts_summary$gtf$counts_junction
.gc_bias_correction
).UCSC2Ensembl
, have you considered using seqlevelsStyle
to convert the
seqlevels
instead? The function here might not be necessary.make_BSgenome
and make_ensdb
creation functions into a
separate package.Hello Marcel,
Thank you very much for the detailed review and suggestions. This is my first full-blown R package. I know I still can improve the package. However, given that our manuscript has been accepted and we have to submit the proofread in a short time, I don't think it is possible to overhaul the package at this moment.
Is it possible to accept the package as it is this moment? Otherwise, I will have to remove "Bioconductor" from the title of the manuscrip: "CleanUpRNAseq: An R/Bioconductor Package for Detecting and Correcting DNA Contamination in RNA-Seq Data".
Thank you,
Haibo
Hi Haibo, @haibol2016
Sorry to hear that and apologies for the delay in the review.
Unfortunately for now, we cannot accept the package as-is but we are happy to accept the package after consideration of the review and the appropriate changes.
Best regards, Marcel
Hello Marcel, For time reason, I may not to fix all the issues you pointed out. Do I have to address all of the issues to my package to be considered by Bioconductor? Otherwise, could you please let me know what are compulsory and what are suggestion?
Thanks,
Haibo
Hi @haibol2016 Yes, please respond to the review before proceeding. Best, Marcel
Hello Marcel, I will respond to your comment below each of them in italics.
The package is quite extensive and has a lot of functionality. It would be good to consider splitting the package into smaller packages with more focused functionality. Currently, it tries to do too much, i.e. data processing, statistical metrics, plotting, BSgenome forging, etc. Consider also contributing to existing packages such as BSgenomeForge and others.
I have decided to move the helper fuctions for creating BSgenome from a multifasta file out of the package.
DESCRIPTION add BugReports and URL fields Remove BiocManager from Suggests field Update Depends field to R (>= 4.4.0) Remove LazyData field or set to false Consider using desc::desc to reformat the DESCRIPTION file.
These issues in the DESCRIPTION file have been easily fixed using desc::desc. Thank you for the suggested tool.
NAMESPACE UCSC2Ensembl may require a more descriptive name Perhaps helper functions (e.g., forge_BSgenome, make_BSgenome, make_ensdb) should be in a separate package? Consider using shorter function names
_make_ensdb is deleted because it is a shallow wrapper function. forge_BSgenome and make_BSgenome and the other related to create a BSgenome from a multifasta file have been removed from the package.
The UCSC2Ensembl function is renamed as styleBSgenome.
vignettes Consider removing the lightweight wrappers from the vignette and pacakge, e.g., make_ensdb
_makeensdb function is deleted
Enable caching in the vignette and use BiocFileCache for downloading files rather than retry_download
Yes, I will use BiocFileCache for downloading files.
Avoid invisible and capture.output in the vignette
I used these functions so that the large amount of information output by featureCounts will not pollute the vignette. I can remove those functions if it is OK with lots of irrelevant information are captured in the vignette.
No need to explicitly use print() in the vignette
print() is removed
Are the plots provided by the functions in the package? If so, consider removing the saved png files from the vignette as they are generated by the functions in the package.
Because the time limit of 10 mins, I saved the png files to the vignette directory from a interactive run of the R chuncks to shorten the runtime.
R Use BiocFileCache for downloading files in the examples
Yes, BioCFileCache is a very good package to meet my purpose here.
Avoid the extra emphasis (!) in error messages.
sure
Use seq_len or seq_along instead of 2:ncol(...) in check_read_assignment_stat
OK
Avoid mixing analysis and graphing functions and consider separating functionalities (in check_read_assignment_stat, check_read_distribution, check_sample_correlation, etc.)
Here the plots is the main output. the data generated during the process is not of interest once the plots are generated.
It seems like some of the functions should be using SummarizedExperiment to represent the counts e.g., check_expr_distribution. Consider reducing the cyclomatic complexity of functions by creating a class (with an established metadata slot) that automatically checks the data rather than having multiple checks in the functions (e.g., check_expressed_gene_percentage).
_I will create a S4 class to represt the data output by summariz_reads () (a wrapper of featureCounts()) and salmonres() (a wrapper of tximport()), along with the metadata.
Remove retry_download and use BiocFileCache for downloading files.
sure
Remove dontrun from the examples and use BiocFileCache to retrieve the downloaded external files.
The two big wrapper function create_diagnostic_plots() and correct_for_contaminaiton() are main APIs to make many function calls chained together as a pipeline. To save time, I just used dontrun since the individual functions have their own examples. I am planning to remove the two major pipeline-like wrapper functions from the package.
The arguments for correct_for_contamination seem to be overloaded and could be helped by using a class that already has the metadata information.
Yes, I will do that
Avoid saving PDF or any output files for the user by default. The user should be able to save the output in the desired format (in create_diagnostic_plot).
_Since this in the pipeline-like wrapper function create_diagnosticplots(), I decide to save the plots as PDF if their sizes are not too big, otherwise tiff. To give the users the flexibility, I also return the ggplot objects.
Avoid using vague function names like exploratory_analysis. This may more appropriate for a separate package with a set of functions for EDA.
_I will rename the function to samplesimilarity().
Consider removing the cache_env feature in the package. Interactive analyses will use the .GlobalEnv.
_cache_env is an environment object used to hold download files to avoid repeated downloading for running examples. Yes, I will remove cacheenv, instead use BiocFileCache.
For usability, it would be preferable to avoid syntax such as cache_env$counts_summary$gtf$counts_junction.
This is only used in examples to access elements from nested lists
Where possible, avoid using numeric indices and use names instead (found in gc_bias_correction).
OK
For UCSC2Ensembl, have you considered using seqlevelsStyle to convert the seqlevels instead? The function here might not be necessary.
seqlevelsStyle() can only change the main chromosome names, from chr1, chr2, ... chrX to 1, 2, ..., X, but it can change the names of scaffolds.
Consider moving make_BSgenome and make_ensdb creation functions into a separate package.
Those functions will be removed from the package
tests We strongly recommend adding unit tests to ensure that the code in the package is robust and reliable.
I will add tests
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[X] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[X ] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[ X] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[X ] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[X ] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[ X] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[X ] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.