Closed ZW-xjtlu closed 2 years ago
Hi @ZW-xjtlu
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: RegionPropertiesFeatures
Type: Package
Title: Extraction of Region Properties Features from Interval-based Genomic Data
Version: 0.99.7
Authors@R: person("Zhen", "Wei", email = "zhen.wei10@icloud.com",
role = c("aut", "cre"))
Description: Extract predictive genome-derived features from genomic regions. RegionPropertiesFeatures aims to facilitate the feature engineering process for predictive modeling of genomic data. The package can enumerate a large number of genomic features through the combinations between genomic properties (e.g., length, sequence contents, clustering metrics, conservation scores, distance toward ends, ect.) and genomic regions (e.g. exons, introns, promoters, transcripts, genes, 5'UTR, CDS, 3'UTR ect.). Compared with only using the sequence-derived features, adding comprehensive features of region properties can significantly improve the predictive performance over various end applications in genomics. In addition, the interpretations of regional properties are highly intuitive, so that the biological insights can be easily derived through the feature importance analysis.
License: LGPL (>= 3)
Imports: BiocGenerics, Biostrings, GenomeInfoDb, GenomicRanges, GenomicScores, BSgenome, IRanges, ensembldb, matrixStats, methods, GenomicFeatures, gtools, stats, S4Vectors, utils
Depends: R (>= 4.0)
Suggests: knitr, rmarkdown, ggplot2, BSgenome.Hsapiens.UCSC.hg19, TxDb.Hsapiens.UCSC.hg19.knownGene
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.0
biocViews: FunctionalPrediction, FeatureExtraction, GenomeAnnotation, FunctionalGenomics,
Classification, Clustering, Regression
NeedsCompilation: no
A reviewer has been assigned to your package. Learn what to expect during the review process.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "skipped, TIMEOUT, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 19640fb947b0d1ca3ae4ed0581d209d153c8c2d9
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped, TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 4985692b1332265df4297e71454ea6c696652e49
Received a valid push on git.bioconductor.org; starting a build for commit id: fc62e11460036248d7ee911abbe5e99a7d576ae2
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped, TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR, skipped, TIMEOUT". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Thank you for your submission. Please see the following review:
Build Report
[ ] Remove system files from being git tracked
[ ] Consider shortening line lengths. Please try to keep to an 80 character limit as the code will otherwise be difficult to read when displayed online. One suggested alternative is to include linebreaks after function arguments.
[ ] Please update R version to 4.1
[ ] Please fix coding practice NOTES
[ ] Please use donttest instead of dontrun in man pages
[ ] All NOTES and Warnings should be addressed unless justification provided.
R CMD build
DESCRIPTION
README
inst
inst/doc
. It seems based on glancing at the vignette, these files should be made to be interactively generated and be in the vignettes directory.test
Vignette
[ ] Please remove Vignette_V_1.0.0.R
. Only the Rmd
file should be included in the directory
[ ] We would recommend changing the name. Vignette_V_1.0.0 is not very intuitive for a user to find the vignette. We would encourage the vignette to be the same name as the package RegionPropertiesFeatures.Rmd
so that a naive user could do vignette("RegionPropertiesFeatures")
without having to know anything about your package.
[ ] Please show the necessary library calls to run the code in the quick start. It seems like you hid an entire section of code that loads the libraries and assigns to values, as well as sets a seed and creates a GRanges object. All this code should be shown to the user so they can execute the code in the Quick Start.
[ ] The vignette should not redirect to a previously generated static document. The vignette of the package should have the main documentation for the package and executable code so package functionality is displayed and tested upon build/check of the package.
[ ] Until the other documentation is moved to an interactively generated vignette, I will not evaluate. The current vignette is inadequate in describing and executing package functionality. Bioconductor does not all static vignettes; in exceptions we will allow as supplemental material to a main vignette but as mentioned the current package vignette should be expanded.
man
?RegionPropertiesFeatures
. It could be minimal and redirect to main documentation for main function of the package.R
[ ] Please use message
instead of cat
. message
also does a paste0
automatically so you would not have to include a paste0. sequential calls should be consolidated into one call instead of having repeated cat/message
both of efficiency and for easy of reading.
[ ] for loops are generally inefficient, please consider changing to vapply
[ ] In generally EnumerateRegionFeatures seems like it could be condensed into a helper function of some sort for each of these sections as each section has very similar structure and calls.
Please address the above issues. When ready please do another version bump to trigger a new build and comment back here with updates and comments and to request a re-review. Cheers
Thank you for your submission. Please see the following review:
Build Report
- [x] Remove system files from being git tracked
- [x] Consider shortening line lengths. Please try to keep to an 80 character limit as the code will otherwise be difficult to read when displayed online. One suggested alternative is to include linebreaks after function arguments.
- [x] Please update R version to 4.1
- [x] Please fix coding practice NOTES
- [x] Please use donttest instead of dontrun in man pages
- [ ] All NOTES and Warnings should be addressed unless justification provided.
R CMD build
- [x] produces the vignette to pop up interactively. This should not occur. The static pdf that is utilized should be transferred to the vignette of the package.
DESCRIPTION
- [x] phastCons100way.UCSC.hg19 is used in vignette and therefore should be in the Suggest of the description.
README
- [x] Please also include instructions to install from Bioconductor
inst
- [x] You should not need
inst/doc
. It seems based on glancing at the vignette, these files should be made to be interactively generated and be in the vignettes directory.test
- [x] We highly recommend implementing tests to ensure the functions are operating as expected, especially for all the extraction methods implemented.
Vignette
- [x] Please remove
Vignette_V_1.0.0.R
. Only theRmd
file should be included in the directory- [x] We would recommend changing the name. Vignette_V_1.0.0 is not very intuitive for a user to find the vignette. We would encourage the vignette to be the same name as the package
RegionPropertiesFeatures.Rmd
so that a naive user could dovignette("RegionPropertiesFeatures")
without having to know anything about your package.- [x] Please show the necessary library calls to run the code in the quick start. It seems like you hid an entire section of code that loads the libraries and assigns to values, as well as sets a seed and creates a GRanges object. All this code should be shown to the user so they can execute the code in the Quick Start.
- [x] The vignette should not redirect to a previously generated static document. The vignette of the package should have the main documentation for the package and executable code so package functionality is displayed and tested upon build/check of the package.
- [x] Until the other documentation is moved to an interactively generated vignette, I will not evaluate. The current vignette is inadequate in describing and executing package functionality. Bioconductor does not all static vignettes; in exceptions we will allow as supplemental material to a main vignette but as mentioned the current package vignette should be expanded.
man
- [x] We recommend having a package level man page so that a naive user could do
?RegionPropertiesFeatures
. It could be minimal and redirect to main documentation for main function of the package.R
- [x] Please use
message
instead ofcat
.message
also does apaste0
automatically so you would not have to include a paste0. sequential calls should be consolidated into one call instead of having repeatedcat/message
both of efficiency and for easy of reading.- [ ] for loops are generally inefficient, please consider changing to vapply
- [ ] In generally EnumerateRegionFeatures seems like it could be condensed into a helper function of some sort for each of these sections as each section has very similar structure and calls.
Please address the above issues. When ready please do another version bump to trigger a new build and comment back here with updates and comments and to request a re-review. Cheers
Dear Ishep,
Thanks for your comments, I will try to implement these soon.
Received a valid push on git.bioconductor.org; starting a build for commit id: 276510ab5ca883423c9205623e9d4dbf5c1a2310
Thank you for your submission. Please see the following review: Build Report
- [x] Remove system files from being git tracked
- [x] Consider shortening line lengths. Please try to keep to an 80 character limit as the code will otherwise be difficult to read when displayed online. One suggested alternative is to include linebreaks after function arguments.
- [x] Please update R version to 4.1
- [ ] Please fix coding practice NOTES
- [ ] Please use donttest instead of dontrun in man pages
- [ ] All NOTES and Warnings should be addressed unless justification provided.
R CMD build
- [ ] produces the vignette to pop up interactively. This should not occur. The static pdf that is utilized should be transferred to the vignette of the package.
DESCRIPTION
- [ ] phastCons100way.UCSC.hg19 is used in vignette and therefore should be in the Suggest of the description.
README
- [ ] Please also include instructions to install from Bioconductor
inst
- [ ] You should not need
inst/doc
. It seems based on glancing at the vignette, these files should be made to be interactively generated and be in the vignettes directory.test
- [ ] We highly recommend implementing tests to ensure the functions are operating as expected, especially for all the extraction methods implemented.
Vignette
- [ ] Please remove
Vignette_V_1.0.0.R
. Only theRmd
file should be included in the directory- [ ] We would recommend changing the name. Vignette_V_1.0.0 is not very intuitive for a user to find the vignette. We would encourage the vignette to be the same name as the package
RegionPropertiesFeatures.Rmd
so that a naive user could dovignette("RegionPropertiesFeatures")
without having to know anything about your package.- [ ] Please show the necessary library calls to run the code in the quick start. It seems like you hid an entire section of code that loads the libraries and assigns to values, as well as sets a seed and creates a GRanges object. All this code should be shown to the user so they can execute the code in the Quick Start.
- [ ] The vignette should not redirect to a previously generated static document. The vignette of the package should have the main documentation for the package and executable code so package functionality is displayed and tested upon build/check of the package.
- [ ] Until the other documentation is moved to an interactively generated vignette, I will not evaluate. The current vignette is inadequate in describing and executing package functionality. Bioconductor does not all static vignettes; in exceptions we will allow as supplemental material to a main vignette but as mentioned the current package vignette should be expanded.
man
- [ ] We recommend having a package level man page so that a naive user could do
?RegionPropertiesFeatures
. It could be minimal and redirect to main documentation for main function of the package.R
- [ ] Please use
message
instead ofcat
.message
also does apaste0
automatically so you would not have to include a paste0. sequential calls should be consolidated into one call instead of having repeatedcat/message
both of efficiency and for easy of reading.- [ ] for loops are generally inefficient, please consider changing to vapply
- [ ] In generally EnumerateRegionFeatures seems like it could be condensed into a helper function of some sort for each of these sections as each section has very similar structure and calls.
Please address the above issues. When ready please do another version bump to trigger a new build and comment back here with updates and comments and to request a re-review. Cheers
Dear Ishep,
Thanks for your comments, I will try to implement these soon.
All of the tickled terms are implemented, however, I have difficulty adopting a small number of comments. The call on the function EnumerateRegionFeatures() is difficult to reduce since different regions from txdb have different parameter settings. Also, these regions are big in size, it is better to delete them after the extraction on them is completed.
Vapply is hard to completely adopt for this case because the shape of the output is not regular enough. I tried my best change the code style (and learned a lot), and I thank for the reviewer for her useful comments.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Received a valid push on git.bioconductor.org; starting a build for commit id: 86ab5ed0e5189abaddef04d37a34ee65620e90ed
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/RegionPropertiesFeatures
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Thank you for the changes thus far. A few more comments/concerns/questions:
[ ] Your main function genomeDerivedFeatures
is not interactively run once in either the man pages or the vignette. This is very problematic in that there is no current way to interactively tell if the function is functioning properly. It should be run at least once somewhere in the code. Similarly for sampleSequence
the code should be run once either in the vignette or in the
examples. Especially since in the vignette that you say "as the correct sampling of negative data is often crucial for the success of a data science project in genomics"; if they utilize your function you should be running it somewhere as a test. If the examples chosen take too long to run you should consider using a smaller dataset or a subset for testing.
[ ] Would it be useful to allow the user to choose the features they wish to extract. Perhaps not doing the entire pipeline of extractions but say only those for "properties of exonic CDS" or only "properties of introns". This would also then allow the function to be run in a smaller more manageable chunk.
[ ] You might want to reconsider the printing style you have chosen. Depending on the window size, the output is not very appealing or helpful since it is a long fixed width. Attached screenshot of me running a command on a smaller window:
[ ] It seems like you take in and utilized GRanges extensively which is fantastic. I'm wondering why the output is not also a GRanges function? As it is a dataframe of the same dimensions of the input X it seems like the function should be endomorphic and return a GRanges object?
Thank you. Please make changes or respond to the above concerns. Comment back here when ready.
The release is fast approaching. May we expect updates and comments soon?
This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.
Thank you for your interest in Bioconductor.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.