Closed js2264 closed 4 years ago
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/periodicDNA
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
I have now managed to make the examples/tests run faster, so the check time is under 10 minutes total. Let me know what the next steps should be. Thanks!
Hi @hpages, Any info on when the review will be done? Best, J
Dear @hpages, Sorry to bother you again. Any info on when the full review would be available? Or are your last comments the only ones you had? Best, J
Dear @hpages, Sorry to insist, but is there anything else to do for the review? I would like to submit a manuscript presenting the package and I need it to be available. It has been more than 10 weeks since the beginning of the submission process. Perhaps the review should be transferred to someone with more time available? Best, J
Hi @js2264 , Thanks for the changes to periodicDNA. I'm taking a deeper look at the package now and will get back to you in the next 24h or so. Thanks for your patience. H.
Hi @js2264 ,
An important effort is needed to improve the documentation. See below.
Best, H.
Are utility functions like checkPeriodFromFourier
or withSeq
useful/meaningful to the end user? I can't figure out what users of the periodicDNA package would need these functions for. Furthermore, in the case of checkPeriodFromFourier
I don't really understand what the function does and have no idea how I would use it in the context of a periodicDNA analysis. If a function is an internal utility (i.e. not intended to be used by the end user), please do not export or document it (see my earlier comment about that). Otherwise please improve its man page, including the examples provided in its man page.
Assuming sampleGRanges
is a function intended for the end user (see above point if it's not), then its man page needs improvements. The current description of what the function does exactly is minimalist and seems inaccurate. In particular it says that the returned GRanges has "the same chromosome, width and strand distributions than the original" but that doesn't seem to be the case:
gr <- GRanges(LETTERS[1:5], IRanges(1:5, width=10), strand=c("+", "+", "-", "-", "*"))
set.seed(333)
sampleGRanges(gr)
# GRanges object with 5 ranges and 0 metadata columns:
# seqnames ranges strand
# <Rle> <IRanges> <Rle>
# [1] B 1-10 +
# [2] C 7-10 -
# [3] D 6-8 -
# [4] D 11 -
# [5] E 4-9 *
# -------
# seqinfo: 5 sequences from an unspecified genome
Also why set these artificial seqlengths on the returned GRanges?
seqlengths(gr2)
# A B C D E
# 10 11 12 13 14
That would not make much sense in general and would likely cause problems downstream (e.g. a findOverlaps()
operation would fail because of incompatible seqlengths between query and subject).
The vignette starts with "Dinucleotide periodicity over a set of Genomic Ranges". Unfortunately this somehow suggests that periodicDNA only handles dinucleotides which I believe is not the case. How about giving a general introduction to the periodicDNA packages first, and then delve into the particular case of dinucleotide periodicity?
Why explaining how periodicDNA works in a separate vignette? This information is very relevant to anybody seriously interested in using the package so a much better place for it is in the main vignette. This is also an opportunity to introduce terms like distogram, to define them precisely, and to use them later thru the rest of the documentation.
I could be misunderstanding how periodicDNA works but it seems to me that the message displayed by getPeriodicity()
-- "Applying Fast Fourier Transform to the vector of distances" -- is incorrect. My understanding is that it applies the FFT to the normalised distogram, which is not the same as the vector of distances.
It's very hard to find the documentation for the extra arguments accepted by getPeriodicity()
. The man page for the function (typically accessed with ?getPeriodicity
) doesn't say anything about these arguments but uses them (e.g. range_spectrum
) in its examples. After a while I realized that these arguments are documented in separate man pages (in ?getPeriodicity.DNAStringSet
and in ?getPeriodicity.GRanges
). However most users will struggle to find these man pages, at best, or won't be able to find them. This problem is usually avoided by documenting the S3 generic and all its methods in the same man page. Another benefit of doing this is that it avoids repeating the same information across all the man pages (e.g. the long Value section) which will make long term maintenance significantly easier.
The other S3 generics (getFPI
, plotAggregateCoverage
, etc..) have the same documentation problem.
Why isn't motif
an argument of the getPeriodicity()
generic itself since all the methods actually need a motif in order to work? It is generally considered better design to put arguments that are common to all methods in the generic.
The motif
argument in ?getPeriodicity.DNAStringSet
is described as "a dinucleotide of interest" and the 6th item in the list returned by the function is described as "The motif object is the dinucleotide being analysed". Unfortunately this again suggests that the function can only handle dinucleotides.
getPeriodicity
returns a list of 7 components but the man page for getPeriodicity
only documents 6.
Generally speaking functions are expected to catch edge cases (or misuse on the part of the user) in a graceful manner, and to display useful error messages. For example it's not immediately clear what's going on here:
motif <- "AAA"
seqs <- rep(DNAStringSet("AATCTAATCTAATCTAATCTAATCTAATCTAA"), 20)
periodicity_result <- getPeriodicity(seqs, motif)
# - Mapping k-mers.
# range_spectrum (1:200) is wider than the maximum distance between pairs of k-mers (-Inf). Try shortening range_spectrum to a smaller range.
# Error in getPeriodicity.DNAStringSet(seqs, motif) :
# In addition: Warning messages:
# 1: In max(dists) : no non-missing arguments to max; returning -Inf
# 2: In max(dists) : no non-missing arguments to max; returning -Inf
# 3: In max(dists) : no non-missing arguments to max; returning -Inf
So please implement some basic checking of the user input and make sure to print informative error messages.
Keep the code chunks in the vignette as simple and easy to understand as possible, preferably sticking to traditional R syntax. For example, this is mostly a matter of taste but I don't find something like
periodicity_result$periodicityMetrics %>% '['(.$Period == 10, )
to be particularly easy to read/understand. The more traditional R syntax is IMO much clearer:
subset(periodicity_result$periodicityMetrics, Period == 10)
Hi @hpages, Thanks for your review. I will start working on it asap. Best, J
Received a valid push; starting a build. Commits are:
a390773 bump version 0.99.18
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/periodicDNA
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi @hpages,
I have tried to address all your comments. Briefly:
motif
to the generic function directly. Thanks for the tip!getPeriodicity()
manual.Let me know if there are further changes you'd want me to work on, Thanks!
Hi @js2264 ,
All the changes look good. Package is ready to go. Thanks for your contribution to the project.
Cheers, H.
Your package has been accepted. It will be added to the Bioconductor nightly builds.
Thank you for contributing to Bioconductor!
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/js2264.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("periodicDNA")
. The package 'landing page' will be created at
https://bioconductor.org/packages/periodicDNA
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
I am familiar with the essential aspects of Bioconductor software management, including:
For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.