Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

(inactive) qckitfastq #753

Closed winnie0521 closed 5 years ago

winnie0521 commented 6 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 6 years ago

Hi @winnie0521

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: qckitfastq
Type: Package
Title: FASTQ Quality Control
Version: 0.99.0
Author: Wenyue Xing
Maintainer: August Guang <august.guang@gmail.com>
Description: Assessment of FASTQ file format with multiple metrics including quality score, sequence content, overrepresented sequence and Kmers.  
License: Artistic-2.0
Encoding: UTF-8
LazyData: false
RoxygenNote: 6.0.1.9000
SystemRequirements: C++11, GNU make
biocViews:Software,QualityControl,Sequencing
LinkingTo: Rcpp
Imports:
  magrittr,
  ggplot2,
  dplyr,
  seqTools,
  zlibbioc, 
  testthat,
  data.table,
  reshape2,
  grDevices,
  graphics,
  stats,
  utils,
  Rcpp
Suggests: knitr,
    rmarkdown
VignetteBuilder: knitr

Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.

bioc-issue-bot commented 6 years ago

A reviewer has been assigned to your package Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

6ab44b8 .2version

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

dbc930f description version 3

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

2064b12 5

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

c0c61f0 6

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

62d47d4 7

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

d99aad6 import count from dply 7b8ff79 import count from dply

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

bd4d7d5 version9 makevar

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

7b15c3c version10 build

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

hpages commented 6 years ago

Hi @winnie0521 ,

Thanks for your submission. Note that the following line is all you need in your Makevars file:

PKG_LIBS = -lz

I notified the author of the zlibbioc vignette that the documentation is a little bit misleading. He will update it soon.

Regards, H.

bioc-issue-bot commented 6 years ago

Received a valid push; starting a build. Commits are:

d7bcec9 version11 makevar change

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

bioc-issue-bot commented 6 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

hpages commented 6 years ago

Hi @winnie0521

I looked at qckitfastq and have some feedback for you:

Value

  1. First of all: R packages for doing quality control of sequencing data have been available for many years. A popular Bioconductor package for this is ShortRead. There is also fastqcr on CRAN that wraps FASTQC functionality in an R package. So my question is: what qckitfastq has to offer that is not already available via ShortRead or fastqcr?

Vignette

  1. Your vignette needs an introduction explaining what the package is for. Also please provide some context and the motivations e.g. what is the typical use case for the package, on what kind of data it can be used, coming from what kind of experiment etc... Put yourself in the shoes of your reader. Don't take for granted that things that are obvious for you are obvious for your reader.

  2. Please capitalize the first letter in the first word of your titles (main title, sections and subsection titles).

  3. Do not use devtools::install_github in your vignette to install the package. Once a package is part of Bioconductor, it should be installed with biocLite() i.e. with:

    source("https://bioconductor.org/biocLite.R")
    biocLite("qckitfastq")

    so please show this instead. See https://bioconductor.org/install/ for the detailed instructions. It's ok to add a note showing how to install the development version of the package though but please make it clear that, unless the user knows what s/he's doing, this should be avoided as it can lead to package version mismatch hell.

  4. Also use eval=FALSE for this code chunk otherwise the package will re-install itself everytime its vignette gets built (e.g. when someone or the build system runs R CMD build on it). More generally speaking, the code in the man pages, vignettes and unit tests should not effectively install packages.

  5. The vignette says:

    For some functions based on seqTools, we will first process the file through seqTools and then let qckitfastq functions proceed with the analysis.

    Please explain this. What seqTools::fastqq does exactly? Why do we need to do this? Couldn't the functions in qckitfastq do this for us?

  6. Please briefly comment the result of each step of the analysis.

Man pages

  1. Man page for dimensions mentions ncolumn but there is no such function.

  2. Overall man pages are too minimalist.

Other

  1. Please address the following notes from the Single Package Builder:

    * NOTE: Avoid sapply(); use vapply() found in files:
      over_rep_kmer.R (line 82, column 25)
    * NOTE: Avoid 1:...; use seq_len() or seq_along() found in files:
      over_rep_kmer.R (line 37, column 13)
      plot_overep_seq.R (line 21, column 21)
      plot_overep_seq.R (line 25, column 23)
  2. Many of your functions provide the writefile and prefix arguments let the user write the result to a file or not. Not only this is clumpsy design but also this feature is poorly documented (the man pages don't say a thing about the location and name of the file by default). So the feature is inconvenient to use and obscure. A better (and simpler) design would be to provide a single argument e.g. output_file that is NA by default and to write the result to a file when this argument is not NA (it must then be a single string specifying the path to the file). If you want a good file name to be used by default, another approach is to provide 2 arguments, write_output (similar to your writefile argument) and output_file with a default value for output_file e.g.

    plot_quality_score <- function(basic_statistics, write_output=FALSE, output_file="quality_score.pdf")
    {
        ...
    }
  3. Why specify writefile=FALSE in your examples (man pages and vignette) when this is the default? The fact that you explicitely specify writefile=FALSE is distracting and potentially misleading because it suggests that a file is written by default.

  4. C++ code: using 100 lines of code to initialize ascii_map when you could do this in 2 lines with a loop is very poor coding practice. Even worse: you do this twice! (Once in the process_fastq() function and once again in the qual_score_per_read() function.) Furthermore it seems that the only purpose of ascii_map is to map ascii characters to their ascii code minus 33. Please realize that you could just do (int) c - 33 for this. So you don't need ascii_map at all.

    Keeping things as simple as possible is one of the first and most important pre-requisite for good programming!

hpages commented 6 years ago

Hi @winnie0521 -- Are you planning to followup on this submission? If not, we'll close the issue. Thanks! H.

aguang commented 6 years ago

@hpages Hi, I will be following up on the submission, thank you. Will try to get in a response by the end of next week.

lshep commented 6 years ago

This package has been idle for quite some time. Please push changes or we will consider the issue inactive and close for inactive. You will be able to reopen the issue at a later date when you can actively participate in the review process.

bioc-issue-bot commented 6 years ago

This issue is being closed because there has been no progress for an extended period of time. You may reopen the issue when you have the time to actively participate in the review / submission process. Please also keep in mind that a package accepted to Bioconductor requires a commitment on your part to ongoing maintenance.

Thank you for interest in Bioconductor.

aguang commented 5 years ago

Hi, I would like to reopen this issue, what is the way to do so?

aguang commented 5 years ago

I pushed commits to master which got a 200 response from the webhook, but it doesn't seem like bioc-issue-bot is building the package. Does the tag need to be changed?

Also, I have addressed concerns in the previous review in the repo in response.md.

lshep commented 5 years ago

Please ensure the webhook is still initialized properly and then try a version bump again.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

dea38e9 version bump for biocbot 7ae8d13 Merge branch 'master' of https://github.com/compbi...

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

35f03e1 addressed biocbot build errors

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

01d3136 #define STRICT_R_HEADER

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

3403ad9 sped up calc_adapter_content with StringSet

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

6d3c4df cleaned up code and tests

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

hpages commented 5 years ago

@aguang

Thanks for your work on qckitfastq. Please let me know when the package is ready for a 2nd review.

Thanks again, H.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

5ab0b96 mistake in example c743c0c update version number

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

aguang commented 5 years ago

@hpages The package is ready, thanks.

hpages commented 5 years ago

Hi @aguang,

Thanks for all the work you're putting on qckitfastq.

Please note that we don't support C++14 yet on our Windows build machines so we'll need to mark qckitfastq as not supported on this platform. We also need to do the same for RSeqAn which is currently failing to install on Windows in devel (see https://bioconductor.org/checkResults/3.9/bioc-LATEST/RSeqAn/). This dependency on C++14 is unfortunate and will likely hurt the deployment of your package on clusters with slow-moving enterprise Linux distributions.

Thanks for the much improved vignette. Here is some additional feedback about the vignette:

  1. I was curious to take a look at the "qckit project page" but the link provided in the vignette does not work.

  2. The top-level categorization of the various metrics ("Basic metrics", "C++ based metrics", "seqTools::fastqq based metrics") feels arbitrary and lacks relevance from an end-user point of view (e.g. the particular programming language used to implement the calculation of a metrics is really an implementation detail that is not relevant for the end-user). Maybe a more useful categorization would be paying attention to what the metrics are about.

  3. "Basic metrics" is currently a subsubsection, directly inside the "Individual metrics" section, with no subsection in between. Shouldn't it be a subsection instead of a subsubsection? This would put it at the same level as the "C++ based metrics" and "seqTools::fastqq based metrics" sections.

  4. More generally it feels like the overall structure of the document needs some attention, with a special attention to the nesting level of sections, subsections, and subsubsections. For example do you really want "Kmers count per base" and "Overrepresented kmers" to be subsections of the "Individual metrics" section at the same level as the "Basic metrics", "C++ based metrics", and "seqTools::fastqq based metrics" subsections?

  5. The bibliography at the end of the vignette should be in its own section. Also in the body of the document references should be linked to their corresponding entry in the bibliography.

Thanks again, H.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

7e552c8 split plot_read_length into read_length and plot 0ca33fa caching packages 4703787 perseq_quality 7bd37a4 refactoring c7e4cc7 updated vignette to include adapter content 3f6460d updated vignette yaml 9721f20 update version

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

aguang commented 5 years ago

Hi @hpages,

Thanks for your review. There is only 1 main function that depends on C++14, which is calc_adapter_content to detect adapter contamination in reads. I can remove the function for Windows builds if there is a way to flag that as a possible solution. Should I ignore the current errors for the Windows build then?

To respond to the other points:

  1. Sorry, someone else on my team is supposed to work on the qckit project page, but it is not up yet. I have removed the link for now. 15-18. Thanks for this, I have reorganized the vignette to just detail individual metrics rather than by what particular programming language they use. In general the vignette should be more organized with a description for each metric and plot and table styles. The bibliography also now has its own section.
hpages commented 5 years ago

Hi @aguang,

Feel free to use conditionals to disable the calc_adapter_content function on Windows. Then, at the R level, the adapter_content() function should fail graciously on this platform (with an error message saying something like "this function is not available on Windows, sorry"). Finally the man page for adapter_content() and the vignette would also need to use conditionals to disable calls to the function on Windows. If you decide to go that route, then you'll know that you've succeeded when qckitfastq installs, builds, and passes check on this platform.

Thanks for the improvements to the vignette. Please note that you're using commas improperly (and inconsistently) to separate the first and last names of the authors in the References section (e.g. "Andrews, Simon" or "Ewels, Philip"). Also, as mentioned earlier, it would be really nice if the references in the text were linked to their corresponding entry in the References section.

Thanks again, H.