Closed hpages closed 2 months ago
Hi @hpages
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: pwalign
Title: Perform pairwise sequence alignments
Description: The two main functions in the package are pairwiseAlignment() and
stringDist(). The former solves (Needleman-Wunsch) global alignment,
(Smith-Waterman) local alignment, and (ends-free) overlap alignment
problems. The latter computes the Levenshtein edit distance or pairwise
alignment score matrix for a set of strings.
biocViews: Alignment, SequenceMatching, Sequencing, Genetics
URL: https://bioconductor.org/packages/pwalign
BugReports: https://github.com/Bioconductor/pwalign/issues
Version: 0.99.0
License: Artistic-2.0
Encoding: UTF-8
Authors@R: c(
person("Patrick", "Aboyoun", role="aut"),
person("Robert", "Gentleman", role="aut"),
person("Hervé", "Pagès", role="cre",
email="hpages.on.github@gmail.com"))
Depends: BiocGenerics, S4Vectors, IRanges, Biostrings (>= 2.71.5)
Imports: methods, utils
LinkingTo: S4Vectors, IRanges, XVector, Biostrings
Enhances: Rmpi
Suggests: RUnit
LazyLoad: yes
Collate: 00datacache.R
utils.R
InDel-class.R
AlignedXStringSet-class.R
PairwiseAlignments-class.R
PairwiseAlignmentsSingleSubject-class.R
PairwiseAlignments-io.R
align-utils.R
pid.R
substitution_matrices.R
pairwiseAlignment.R
stringDist.R
zzz.R
pwalign contains the pairwiseAlignment-related stuff taken from Biostrings. The plan is to deprecate this stuff in Biostrings (in BioC 3.19) and redirect the user to the stuff that is now in pwalign, then to make it defunct in Biostrings (in BioC 3.20), and finally to remove it from Biostrings (in BioC 3.21).
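The deprecate / defunct / remove sequence described above maps onto base R's .Deprecated() and .Defunct() helpers. The following is only a sketch of that life cycle, not the actual Biostrings code: newAlign(), oldAlign_deprecated() and oldAlign_defunct() are hypothetical names made up for illustration.

```r
# Hypothetical stand-in for the function that now lives in the new package.
newAlign <- function(pattern, subject) {
    paste(pattern, subject, sep = "/")
}

# Stage 1 (deprecated): still works, but warns and points to the replacement.
oldAlign_deprecated <- function(pattern, subject) {
    .Deprecated(msg = "oldAlign() is deprecated; use newAlign() instead.")
    newAlign(pattern, subject)
}

# Stage 2 (defunct): calling the function is now an error.
oldAlign_defunct <- function(pattern, subject) {
    .Defunct(msg = "oldAlign() is defunct; use newAlign() instead.")
}

# Stage 3 (removed): the function is simply deleted from the old package.

# The deprecated version still returns a result; the warning is caught here
# only to keep the output tidy.
res <- withCallingHandlers(
    oldAlign_deprecated("ACGT", "ACCT"),
    warning = function(w) {
        message("warning caught: ", conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)

# The defunct version fails with an error condition.
err <- tryCatch(oldAlign_defunct("ACGT", "ACCT"), error = function(e) e)
```

Each stage gives users one release cycle to react before the next, stricter stage kicks in.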
The motivations for this split are:
- The pairwiseAlignment-related stuff in Biostrings adds a lot of complexity to the package (via additional specialized classes, generics, and methods, and a lot of complex C code to support them). This split will make Biostrings about 20% smaller. This in turn will make its maintenance easier and will also make R CMD check slightly faster.
- Most Biostrings revdeps don't use the pairwiseAlignment functionality. So this split won't affect most of Biostrings revdeps. However they will now depend on a lighter Biostrings that will be slightly faster to install (faster to download, compile, and load).
- It removes the pairwiseAlignment-related stuff from Aidan's plate.
H.
I can pass this on to build the reports; however, it will fail until the latest version of Biostrings is available.
Biostrings 2.71.5 (latest version) is already on nebbiolo1 so we should be good to go.
It has not propagated yet: https://bioconductor.org/checkResults/devel/bioc-LATEST/Biostrings/
It doesn't need to. It's on the machine.
Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on the Bioconductor Single Package Builder.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): pwalign_0.99.0.tar.gz
Links above active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/pwalign
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
The 2 WARNINGs were expected.
One is about using RMarkdown instead of Sweave for the vignette. Note that the vignette was just taken from Biostrings and put in pwalign. You can see it here. It was written a long time ago by Patrick Aboyoun, the original author of the pairwiseAlignment stuff. Since it contains a lot of mathematical formulae that would be tricky to translate to markdown, I don't intend to make the conversion, at least not for now.
The other WARNING is about "Empty or missing \value sections found in man pages." This is a false positive that I reported here yesterday.
Let me know if you have questions.
H.
A reviewer has been assigned to your package for an in-depth review. Please respond accordingly to any further comments from the reviewer.
Hi Hervé, @hpages
Thank you for your submission. Please see the review below.
Best regards, Marcel
- The useMpi functionality is in question. I mostly provided minor notes given that this code is established and ported over from Biostrings.
- Note that the LazyLoad field is now ignored. The replacement LazyData should be set to false or not included. Users should use data(...) to load a dataset rather than have them in the .GlobalEnv.
- Consider converting the Rnw file to Rmd.
- RTobjs does not seem to be used anywhere, consider its removal.
- In mismatchSummary,AlignedXStringSet0-method (and others), consider:
weight <- as.integer(weight)
## instead of
if (!is.integer(weight))
    weight <- as.integer(weight)
- Minor: To avoid repetition, perhaps use a default compareStrings,ANY,ANY-method to coerce both pattern and subject inputs to character and dispatch to the compareStrings,character,character-method:
setMethod("compareStrings",
    signature = c(pattern = "ANY", subject = "ANY"),
    function(pattern, subject) {
        compareStrings(as.character(pattern), as.character(subject))
    })
- It looks like useMpi is disabled. Will it work again or should it be removed?
- The arguments should list all possible options e.g., in stringDist() the default method argument should be the vector of possibilities i.e., c("levenshtein", "hamming", "quality", "substitutionMatrix") and match.arg will ensure that one of them is selected.
- Consider promoting functions from Biostrings to exported functions rather than using ::: (in R/utils.R).
- No test coverage for AlignedXStringSet-class.R:
> covr::package_coverage(type = "all")
pwalign Coverage: 76.46%
R/AlignedXStringSet-class.R: 0.00%
R/align-utils.R: 38.89%
R/PairwiseAlignmentsSingleSubject-class.R: 39.22%
R/pairwiseAlignment.R: 49.49%
R/zzz.R: 50.00%
R/00datacache.R: 66.67%
R/stringDist.R: 71.76%
R/PairwiseAlignments-class.R: 72.22%
R/PairwiseAlignments-io.R: 86.01%
R/substitution_matrices.R: 87.76%
src/align_pairwiseAlignment.c: 89.47%
src/align_utils.c: 99.48%
R/InDel-class.R: 100.00%
src/R_init_pairwiseAlignment.c: 100.00%
Received a valid push on git.bioconductor.org; starting a build for commit id: be1b36bf4fe64419cbcc64b9316c823fd576bb51
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the build report for more details.
The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): pwalign_0.99.1.tar.gz
Thanks Marcel for the feedback.
> Note that LazyLoad field is now ignored.
Removed.
> Consider converting the Rnw file to Rmd.
See my previous comment from 2 weeks ago above.
> RTobjs does not seem to be used anywhere, consider its removal.
Removed.
if (x is not the thing we want) x <- turn_x_into_the_thing_we_want(x)
I prefer that idiom over an unconditional x <- turn_x_into_the_thing_we_want(x). That's because even if x is already the thing we want, sadly turn_x_into_the_thing_we_want() is not guaranteed to be a no-op. For example as.character(x) will drop the names of character vector x, and as(x, "A") might transform x even if is(x, "A") is TRUE. It might matter (e.g. when the object is big and turn_x_into_the_thing_we_want(x) triggers a copy) or not (like here).
FWIW this has hit me a few times in the past so I got into the habit of systematically using the if (x is not the thing we want) x <- turn_x_into_the_thing_we_want(x) idiom without even thinking about it.
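A small base R illustration of the point about as.character() not being a no-op. The vectors x and w are made-up examples; only the coercion behavior is being shown.

```r
x <- c(a = "apple", b = "banana")   # already a character vector, with names

# Unconditional coercion: as.character() strips attributes, names included.
unconditional <- as.character(x)

# Conditional coercion: x is left untouched when it is already character.
conditional <- x
if (!is.character(conditional))
    conditional <- as.character(conditional)

names(unconditional)  # NULL
names(conditional)    # "a" "b"

# Same story for an integer vector passed through as.integer().
w <- c(match = 1L, mismatch = 3L)
names(as.integer(w))  # NULL
```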
> Minor: To avoid repetition, perhaps use a default compareStrings,ANY,ANY-method etc...
I simplified the compareStrings() methods a bit. A minor disadvantage of a compareStrings,ANY,ANY-method that blindly coerces anything you throw at it to character is that it might do some weird/unexpected things for some exotic stuff. And the error that will result in that case will probably not be of great help to the end user.
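To make the concern about a catch-all method concrete, here is a toy S4 sketch. compareSketch() is a hypothetical generic invented for this example (it just counts mismatching positions) and is not the compareStrings() implementation from Biostrings or pwalign.

```r
library(methods)

# Hypothetical generic, used only for illustration.
setGeneric("compareSketch",
    function(pattern, subject) standardGeneric("compareSketch"))

# Core method: count mismatching positions between two equal-length strings.
setMethod("compareSketch", signature("character", "character"),
    function(pattern, subject) {
        stopifnot(length(pattern) == 1L, length(subject) == 1L,
                  nchar(pattern) == nchar(subject))
        p <- strsplit(pattern, "")[[1]]
        s <- strsplit(subject, "")[[1]]
        sum(p != s)
    })

# Catch-all method: blindly coerce whatever comes in to character.
setMethod("compareSketch", signature("ANY", "ANY"),
    function(pattern, subject) {
        compareSketch(as.character(pattern), as.character(subject))
    })

ok <- compareSketch("ACGT", "ACCT")        # one mismatching position

# A list is silently deparsed to the string "1:3" and then "compared".
weird <- compareSketch(list(1:3), "1:3")

# A function cannot be coerced, so the user gets a low-level coercion error
# that says nothing about which argument types are supported.
err <- tryCatch(compareSketch(function(x) x, "a"), error = function(e) e)
conditionMessage(err)
```

Without the ANY,ANY method, the last two calls would instead fail at dispatch time with a message naming the generic and the offending argument classes.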
> It looks like useMpi is disabled. Will it work again or should it be removed?
I disabled this. Rmpi has been in Enhances (as opposed to Suggests) for the last 15 years or so, and the way things are implemented in pairwiseAlignment() is that it will be used only if the user explicitly loads it before calling the function. This means that the useMpi mode has not been tested on the daily builds for the last 15 years. Furthermore, since this is an undocumented feature, I suppose that nobody has ever used it, except Patrick. Last but not least: it's not covered by the unit tests either.
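For readers unfamiliar with the "used only if the user loads it first" pattern, a generic sketch follows. How pairwiseAlignment() actually detects Rmpi is not shown in this thread, so isAttached() and chooseBackend() are hypothetical helpers illustrating one common way to do it.

```r
# Hypothetical helper: TRUE only if the user already called library(pkg).
isAttached <- function(pkg) {
    paste0("package:", pkg) %in% search()
}

# Hypothetical dispatcher: take the optional code path only when the
# enhancing package is attached, and fall back to the serial path otherwise.
chooseBackend <- function(pkg = "Rmpi") {
    if (isAttached(pkg)) "optional package path" else "serial path"
}

chooseBackend()        # "serial path" unless Rmpi has been attached
chooseBackend("base")  # base is always attached
```

Because nothing in the package itself ever attaches the optional dependency, a build machine that only runs R CMD check would always exercise the serial path.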
I might re-enable it at some point in the not too distant future but some serious testing will be required first. Also, this predates BiocParallel so the Rmpi approach might be completely obsolete, I don't know. Will need to revisit, test, assess, and decide what to do with it.
Disclaimer: I've never used Rmpi myself (Patrick Aboyoun implemented this).
> The arguments should list all possible options e.g., in stringDist()
Usually yes. I think maybe the reason Patrick didn't do it in this case is that the list of all possible values for the method argument is a little bit long (c("levenshtein", "hamming", "quality", "substitutionMatrix")) so it could be ugly to see such a long list in the definition of the S4 generic and all its methods, especially in the \usage section of the man page. Also maybe not all the stringDist() methods might support all these options at the moment, or future methods might want to support different options.
As long as the man page for stringDist() lists all the supported method values I can live with that.
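For reference, the match.arg() convention the reviewer has in mind looks like this in plain R. stringDistSketch() is a hypothetical toy function, not the stringDist() generic from pwalign; it only returns the selected method.

```r
# Hypothetical function: the default for 'method' is the full vector of
# choices, and match.arg() picks the first one when nothing is supplied.
stringDistSketch <- function(x,
        method = c("levenshtein", "hamming", "quality", "substitutionMatrix")) {
    method <- match.arg(method)
    method
}

stringDistSketch("ab")             # "levenshtein" (first choice is the default)
stringDistSketch("ab", "hamming")  # "hamming"
stringDistSketch("ab", "ham")      # partial matching also gives "hamming"

# Anything outside the listed choices is rejected with an error.
bad <- tryCatch(stringDistSketch("ab", "euclid"), error = function(e) e)
```

The trade-off discussed above is visible here: the choices are self-documenting in the signature, at the cost of a long default that is repeated wherever the signature is shown.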
> Consider promoting functions from Biostrings to exported functions rather than using :::
My understanding is that this is acceptable when the upstream and client packages have the same maintainer, which is why R CMD check doesn't say anything in that case.
H.
Hi Hervé, @hpages Thanks for making those changes. The package has been accepted. Best regards, Marcel
Your package has been accepted. It will be added to the Bioconductor nightly builds.
Thank you for contributing to Bioconductor!
Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.
The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/hpages.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("pwalign"). The package 'landing page' will be created at
https://bioconductor.org/packages/pwalign
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.