Closed JokingHero closed 7 years ago
Hi @JokingHero
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: amplican
Type: Package
Title: fast and precise analysis of CRISPR experiments
Description: `amplican` creates reports of deletions, insertions, frameshifts,
cut rates and other metrics in user selected format (preffered html). `amplican`
uses vary fast C implementation of Gotoh alhoritm to align your fastq samples
and automates analysis across different experiments. `amplican` maintains
elasticity through configuration file, which with your fastq samples are only
requirements.
Version: 0.99.0
Authors@R: c(
person("Kornel", "Labun", email = "kornel.labun@gmail.com", role = "aut"),
person(c("Rafael", "Nozal"), "Canyadas", email = "rafanozal@gmail.com", role = "ctr"),
person("Eivind", "Valen", email = "eivind.valen@gmail.com", role = c("cph", "cre"))
)
URL: https://github.com/valenlab/amplican
BugReports: https://github.com/valenlab/amplican/issues
biocViews: Technology, qPCR, CRISPR
License: GPL-3
LazyData: TRUE
LinkingTo: Rcpp
Depends: R (>= 3.3.0)
Imports:
Rcpp,
utils,
R.utils,
seqinr,
ShortRead,
IRanges,
GenomicRanges,
S4Vectors,
doParallel,
foreach,
ggplot2,
ggbio,
stringr,
stats,
rmarkdown,
knitr,
methods
RoxygenNote: 5.0.1
Suggests:
testthat,
BiocStyle
Collate:
'RcppExports.R'
'amplican.R'
'helpers_warnings.R'
'helpers_filters.R'
'helpers_alignment.R'
'gotoh.R'
'amplicanAlign.R'
'amplicanReport.R'
'helpers_directory.R'
'helpers_plots.R'
'helpers_rmd.R'
VignetteBuilder: knitr
Your package has been approved for building. Your package is now submitted to our queue.
IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20160919144907.html
Received a valid push; starting a build. Commits are:
29ab577 registered for bioc-devel mailing list
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20160920061043.html
Received a valid push; starting a build. Commits are:
b25ac06 windows should handle system.file example in ampli...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20160920073636.html
Received a valid push; starting a build. Commits are:
2812aee reverted from using paste0 to system.file entirely...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20160920081305.html
Hi,
moscato1 check is complaining:
* checking loading without being on the library search path ... WARNING
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
  there is no package called 'httr'
Error: package or namespace load failed for 'amplican'
I am rather puzzled, as I do not state any dependency on package 'httr', nor do I believe I need this package for anything. Why is only Windows having this problem and not the other systems? Should I add this package to the dependencies?
Best, Kornel
I got the same warning message as you and don't know how to solve it either.
This seems to be a build system configuration issue that does not require a package author fix; @lshep might respond here with an update.
For this issue, httr is an indirect dependency via ggbio --> biovizBase --> ensembldb --> AnnotationHub
For https://github.com/Bioconductor/Contributions/issues/124, httr is an indirect dependency via methylumi --> minfi --> GEOquery
Thanks for your contribution.
DESCRIPTION
vignette
use tempfile() or tempdir() in the vignette / examples, rather than writing to getwd().
avoid long lines of code, e.g., amplicanOverview.Rmd:97 by un-nesting function calls; don't use explicit path separators but rely on the function call to use the appropriate separator for the operating system in use.
fl <- system.file("extdata", "results", "barcode_reads_filters.csv",
package = "amplican")
barcodeFilters <- read.csv(fl)
R
use file.path() rather than paste0() to construct file paths.
return values, e.g. via invisible(), from all functions. For instance, from amplicanPipeline() return results_folder.
avoid misuse of ifelse() for scalar tests (e.g., amplicanAlign.R:128), use
result <- if (test) {
## TRUE value
} else {
## FALSE value
}
or similar (ifelse() is meant for use with vector arguments).
avoid repeated calls to writeLines(), e.g., amplicanAlign.R:180
writeLines(
c(paste("Config file: ", "foo"),
paste("Processors used: ", 2),
paste("Skip Bad Nucleotides: ", TRUE),
...),
logFileConn)
use seq_len() / seq_along() rather than 1:n / 1:length(x)
use ShortRead::sread(forwardsTable) rather than forwardsTable@sread, as(quality(reads), "matrix") instead of as(slot(reads, "quality"), "matrix") at helpers_filters.R:16.
is this line helpers_alignment.R:272 like Biostrings::reverseComplement()? Use Biostrings to reduce the number of package dependencies.
seqinr::c2s(rev(seqinr::comp(seqinr::s2c(guideRNA))))
unpackFastq() is not necessary for input to ShortRead, if that is how it is being used.
use matrixStats::rowMins() rather than apply() at helpers_filters.R:16 and elsewhere
use grepl("^[ATCG]+$", sread(reads)) or stringr::str_detect(sread(reads), "^[ATCG]+$") rather than sapply(). If the functionality of grepl() and str_detect() is approximately the same and relatively equivalent in terms of performance, then use grepl() to reduce the number of package dependencies. Can you provide an example of the error implied in the comment "possible c stack limits"?
src
man
Please address the points above, and when your package is again passing the build and check process correctly include a brief summary of your response to each of these points.
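Three of the base-R points in the review above (scalar if/else instead of ifelse(), seq_along() instead of 1:length(x), and a vectorised grepl() instead of sapply()) can be illustrated with a small sketch. It is only an illustration: the reads vector is hypothetical plain character data, not output from the package, and the review applies the same grepl() call to sread(reads).

```r
# Scalar test: ifelse() returns a value shaped like `test`, so a
# length-one test silently truncates longer "yes"/"no" values.
test <- TRUE
truncated <- ifelse(test, c("a", "b", "c"), "none")  # only the first element survives
full <- if (test) c("a", "b", "c") else "none"       # the whole vector is kept

# Empty input: 1:length(x) counts down from 1 to 0, seq_along(x) is empty.
x <- character(0)
bad_index <- 1:length(x)
good_index <- seq_along(x)

# Vectorised pattern test on a plain character vector (hypothetical reads).
reads <- c("ATCGGA", "ATNNCG", "GGCC")
only_atcg <- grepl("^[ATCG]+$", reads)
```

A loop written as for (i in 1:length(x)) would run twice on the empty x above, which is the usual reason for preferring seq_along().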
Do you plan to submit a revised package in time for the current release? The deadline is today.
Yes, I will try to fix what I can. I am afraid that reworking the Gotoh implementation so that "It would operate on input objects (from ShortRead and Biostrings?) rather than files, and would return objects that can be computed on (Biostrings::PairwiseAlignment?)" and addressing "The output format from gRCPP contains familiar concepts (e.g., an alignment CIGAR) but in an idiosyncratic format. Present this information in a standard representation." will not be possible today. My goal would be to fix this in the next release. We could still expose the gRCPP function to the user; do you think we should do that? I have refrained from doing this so far, as I am also unhappy with the inputs and outputs of the gRCPP function.
I would rather see
addressed prior to accepting the package; if this can be done in the next several days then we can be more relaxed about the deadline for adding new packages to the Bioconductor release.
Received a valid push; starting a build. Commits are:
4e31808 spell check all the files
2d307a0 more spell check
9a8c0dd using file.path(), identation to 2 spaces
9743239 avoid long lines in vignette, invisible(), removed...
915fcf5 save before removing unpackfastq
5b0dd41 removed unpacking and deleting zipped files
9e1f1a5 writing to files moved to highest level possible
21ff13b biocparallel and rerun on example dataset
c630445 up one version, removed R.utils from DESCRIPTION
1e35a3d Merge pull request #1 from valenlab/bioc_review B...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20161007151956.html
First of all thank you for all the feedback and comments.
In this update I tried my best to address the following comments:
DESCRIPTION
vignette
[ x ] avoid long lines of code, e.g., amplicanOverview.Rmd:97 by un-nesting function calls; don't use explicit path separators but rely on the function call to use the appropriate separator for the operating system in use.
fl <- system.file("extdata", "results", "barcode_reads_filters.csv",
                  package = "amplican")
barcodeFilters <- read.csv(fl)
R
result <- if (test) {
  ## TRUE value
} else {
  ## FALSE value
}
or similar (ifelse() is meant for use with vector arguments).
writeLines(
  c(paste("Config file: ", "foo"),
    paste("Processors used: ", 2),
    paste("Skip Bad Nucleotides: ", TRUE),
    ...),
  logFileConn)
src
man
After some discussion with the maintainer, we decided to switch from our own gotoh function to Biostrings::pairwiseAlignment in this package. Which aligner we use is not the main substance of our contribution: amplican is meant as a pipeline for high-throughput amplicon sequencing, specialized for CRISPR experiments.
Also, we would like to wait for the next Bioconductor release schedule, so that we can test some more and gather more feedback from collaborators.
OK, I will close this issue. Feel free to open a new issue when your updated package is ready.
I tried to open a new issue to submit amplican again, but bioc-issue-bot complains that I have already submitted this repository and that it exists in the issue tracker. See #454. Can we reopen this submission? @mtmorgan @gr22772
Please perform a version bump.
I bumped the version to 0.9.100 as a new starting point. Should this trigger the build automatically? Should we check our web hooks? Or does the red error label "VERSION BUMP REQUIRED" prevent the build?
It should trigger a new build; when the build is successful the 'VERSION BUMP REQUIRED' tag will be removed. Can you check the web hook?
There might be problems with having closed the issue; if the web hook is ok let me know and we'll work on it from this end.
I am sorry for the long delay; apparently we had communication problems. I confirmed that we still have the web hook. Could you work out some solution for our submission? Maybe it would be easier to close this issue, remove it from GitHub completely, and resubmit the package (it has changed so much that I believe the previous comments are no longer relevant)? Or I could change the name to ampliCan: I made the name lower case so it's easier to type, but the actual name on our logo is ampliCan.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20170828171931.html
I suspect that your unit test fails because it uses the same directory on each architecture -- results_folder <- ... should be something like results_folder <- tempfile(); dir.create(results_folder).
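A minimal sketch of that suggestion follows, assuming a hypothetical helper write_results() that stands in for whatever the unit test actually calls (the real test code is not shown in this thread). Each run gets its own directory under tempdir(), paths are built with file.path(), and the folder is returned invisibly.

```r
# Hypothetical stand-in for a pipeline step: writes one file into
# results_folder and returns the folder path invisibly.
write_results <- function(results_folder) {
  out <- file.path(results_folder, "summary.csv")  # OS-appropriate separator
  write.csv(data.frame(id = 1:3, reads = c(10, 20, 30)), out,
            row.names = FALSE)
  invisible(results_folder)
}

results_folder <- tempfile()  # unique path under tempdir()
dir.create(results_folder)
returned <- write_results(results_folder)
written <- list.files(results_folder)
unlink(results_folder, recursive = TRUE)  # clean up after the test
```

Because tempfile() returns a fresh path on every call, builds that run side by side should not write into each other's results.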
Also please confirm on the next version bump (an increment of z to z+1 for version x.y.z is sufficient) that the web hook runs, or at least what the return value is, under settings --> web hooks --> edit and then choose the hook(s) and look at 'Response'.
Just made commit into 0.9.101 and web hook returns above.
Great, thanks, I added the 'review in progress' label for the future, and triggered a manual rebuild.
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20170829082652.html
Great! Thank you!
During code review, any suggestions on how to speed up the getEventInfo function (helpers_general.R) would be very welcome, as it is the main bottleneck (rather than the alignment process itself). The goal is to extract deletions, insertions and mismatches from the PairwiseAlignmentsSingleSubject class into a GRanges object with metadata columns. The main issue is that extracting deletions directly with Biostrings::deletion returns ranges that are not from the subject's point of view. I get around this by first shifting the deletions for each insertion, if any, but it's slow. If there is a way to vectorize this process (maybe with the C-level Biostrings library?), I would be grateful for advice on how to achieve it. For the moment, the current implementation works and is properly tested in test_alignment_helpers.R.
Received a valid push; starting a build. Commits are:
353d7c9 allow for mismatches in primers, make sure plots h...
3cc8e86 improve consensus alghoritm so that it accounts fo...
a94a0f1 change params of alignments, fix when no ins in va...
dbfd7b5 add ampliconConsensus picture and explanation, rev...
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20170918093042.html
Sorry to be slow in returning comments. Below are some minor general things. If you can provide an easy way for me to run an example with getEventInfo() then I'll be happy to work on it in the short term.
Seems like the one-year anniversary of my initial review!
DESCRIPTION / NAMESPACE
vignette
R
use accessors instead of direct slot access, e.g., in validity method replace object@experimentData with experimentData(object)
x
AlignmentsExperimentSet-class.R:76 ensure that enough information is displayed to help the user rather than (as it appears) an overwhelming amount of data.
cat(), message(), warning(), stop() do not usually require paste() inside them, cat(paste("foo", "bar")) --> cat("foo", "bar")
amplicanAlign.R:79 it is better practice to leave BPPARAM unspecified, so the user has control through BiocParallel::register(). The conditional could be simplified as
if (use_parallel) {
  p = BiocParallel::bpparam() # user choice
} else {
  p = BiocParallel::SerialParam() # standard lapply
}
configSplit <- split(cfgT, f = cfgT$Barcode)
finalAES <- BiocParallel::bplapply(configSplit, FUN = makeAlignment,
average_quality,
min_quality,
scoring_matrix,
gap_opening,
gap_extension,
fastqfiles,
primer_mismatch, BPPARAM=p)
amplicanAlign.R:104 do.call(c, x) is usually more efficient than Reduce(c, x)
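Two of the points above are easy to check with base R alone. The sketch below uses toy inputs, not package data, to show that cat() produces the same text with or without paste(), and that do.call(c, x) and Reduce(c, x) give the same result on a plain list.

```r
# cat() already joins its arguments with a space, so paste() is redundant.
with_paste <- capture.output(cat(paste("foo", "bar")))
without_paste <- capture.output(cat("foo", "bar"))

# Reduce(c, x) grows the result one element at a time; do.call(c, x)
# makes a single call c(x[[1]], x[[2]], ...).
x <- list(1:3, 4:6, 7:9)
via_reduce <- Reduce(c, x)
via_do_call <- do.call(c, x)
```

The efficiency difference only matters for long lists, where the repeated copying done by Reduce() adds up.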
Received a valid push; starting a build. Commits are:
de1ad33 bioconductor review + change gap opening to 25
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the following build report for more details:
http://bioconductor.org/spb_reports/amplican_buildreport_20170921135804.html
Thank you for the feedback. Here is a gist with getEventInfo; if you can suggest something to make it faster, future users and I will be grateful.
DESCRIPTION / NAMESPACE
vignette
R
[ x ] use accessors instead of direct slot access, e.g., in validity method replace object@experimentData with experimentData(object)
[ ? ] AlignmentsExperimentSet-class.R:76 ensure that enough information is displayed to help the user rather than (as it appears) an overwhelming amount of data. - Currently the show method prints information only for the first experiment. I changed readCounts to print with the use of str(). In my experience it is easier to manipulate an object if I can look up its first element, even if it's a bit longer.
[ x ] cat(), message(), warning(), stop() do not usually require paste() inside them, cat(paste("foo", "bar")) --> cat("foo", "bar")
[ x ] amplicanAlign.R:79 it is better practice to leave BPPARAM unspecified, so the user has control through BiocParallel::register(). The conditional could be simplified as
if (use_parallel) {
  p = BiocParallel::bpparam() # user choice
} else {
  p = BiocParallel::SerialParam() # standard lapply
}
configSplit <- split(cfgT, f = cfgT$Barcode)
finalAES <- BiocParallel::bplapply(configSplit, FUN = makeAlignment,
                                   average_quality,
                                   min_quality,
                                   scoring_matrix,
                                   gap_opening,
                                   gap_extension,
                                   fastqfiles,
                                   primer_mismatch, BPPARAM=p)
I looked quite a bit at getEventInfo, although I'm not actually familiar with the aligned string representations in Biostrings. I did not come up with meaningful performance improvements. Some minor changes include:
Use the constructor rather than construct-and-assign in defGR(),
GenomicRanges::GRanges(
ranges = x,
strand = strand_info,
seqnames = ID,
originally = as.character(originally),
replacement = as.character(replacement),
type = type,
read_id = names(x),
score = score
)
minimize operations within conditionals, e.g.,
shift_to_subj <- function(x, ampl_shift, subject, strand_info) {
if (strand_info == "+") {
delta <- 1L
} else {
delta <- stringr::str_count(subject, "[ATCG]")
}
IRanges::shift(x, ampl_shift - delta)
}
and
width <- nchar(align)
if (strand_info == "+") {
s_err <- width + ampl_shift - 1L >= ampl_len
start <- width[!s_err] + ampl_shift
end <- if (all(s_err)) integer() else ampl_len
} else {
s_err <- width + abs(ampl_shift - ampl_len) >= ampl_len
start <- if (all(s_err)) integer() else 1L
end <- ampl_shift - width[!s_err]
}
sizes <- IRanges::IRanges(start = start, end = end, names = which(!s_err))
avoid nested iterations with vectorization, e.g.,
ins_r <- rep(ins, each = lengths(del))
del_s <- unlist(start(del))
sft <- -1 * sum(width(ins_r)[del_s > start(ins_r)])
del_sft <- relist(sft, del)
del <- IRanges::shift(del, del_sft)
this is actually a little slower than an intermediate solution that hoists the accessors out of the iteration:
shift_del <- mendoapply(function(x, y, w) {
vapply(
x,
function(x_i, y, w) sum(w[x_i > y]),
integer(1),
y, w
)
}, BiocGenerics::start(del), BiocGenerics::start(ins), BiocGenerics::width(ins))
del <- IRanges::shift(del, -1 * shift_del)
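The idea behind the shift can be sketched in base R for a single read. This is an illustration only: the coordinates are hypothetical integer vectors rather than IRanges objects, and the matrix-product form is one possible way to vectorise, not the code used in the package. For each deletion, the shift is the total width of the insertions that start before it.

```r
# Hypothetical event coordinates for a single read.
del_start <- c(5L, 12L, 20L)  # start of each deletion
ins_start <- c(3L, 15L)       # start of each insertion
ins_width <- c(2L, 4L)        # width of each insertion

# Iterative version: for each deletion, add up the widths of the
# insertions that start before it.
shift_loop <- vapply(
  del_start,
  function(d) sum(ins_width[d > ins_start]),
  integer(1)
)

# Vectorised version: compare every deletion with every insertion at once,
# then take the width-weighted row sums with a matrix product.
before <- outer(del_start, ins_start, ">")
shift_vec <- as.integer(before %*% ins_width)

shifted_start <- del_start - shift_vec
```

The outer() comparison costs memory proportional to deletions times insertions, so it suits reads with few events; whether it beats the mendoapply() version on real alignments would need to be measured.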
You can either incorporate these changes or not; let me know via a comment and I will accept the package.
Thank you for your effort and for caring! Do you think implementing some parts in C++ (maybe the mendoapply(function(x, y, w) ...) part) would give any benefit?
no; about 30% of the time is in mismatchSummary(), which is doing complicated queries on events; it would be tedious and error-prone to do that in C. I think you could get a speed-up by iterating on the events part
width <- nchar(align)
subj <- as.character(subject(align))
pat <- Biostrings::pattern(align)
del <- Biostrings::deletion(align)
ins <- Biostrings::insertion(align)
mm <- Biostrings::mismatchSummary(align)$subject
and collapsing the result of the iteration into Vectors and a partitioning, allowing for vectorization, but that would be a little (not impossible) tedious.
I'll accept this package now; further performance improvements can be pursued once it's in Bioconductor.
Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be sent to the maintainer email address in the next several days.
Thank you for contributing to Bioconductor!
Alright, Thank you!
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (tithub.com/
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using biocLite("YOUR_PACKAGE_NAME"). The package 'landing page' will be created at
https://bioconductor.org/packages/YOUR_PACKAGE_NAME
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.