Bioconductor / Contributions

Contribute Packages to Bioconductor
134 stars 33 forks source link

CircSeqAlignTk #2742

Closed jsun closed 1 year ago

jsun commented 2 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 2 years ago

Hi @jsun

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: CircSeqAlignTk
Type: Package
Title: A toolkit for end-to-end analysis of RNA-seq data for circular genomes
Version: 0.99.0
Authors@R: 
    c(person("Jianqiang", "Sun", , "sun@biunit.dev", role = c("aut", "cre"),
 comment = c(ORCID = "0000-0002-3438-3199")),
      person("Xi", "Fu", , role = c("aut")),
      person("Wei", "Cao", role = c("aut")))
Description: 
    CircSeqAlignTk is an R package for end-to-end analysis,
    from alignment to visualization,
    of RNA-seq data obtained from organelles or organisms
    with circular genome sequences,
    such as bacteria, viruses, and viroids.
    In addition, CircSeqAlignTk implements a tidy interface
    to generate synthetic sequencing data that mimic real RNA-seq data,
    allowing developers to evaluate the performance of alignment tools,
    new alignment algorithms, and new workflows.
Depends: R (>= 4.2)
Imports:
    stats,
    tools,
    utils,
    methods,
    S4Vectors,
    rlang,
    magrittr,
    dplyr,
    tidyr,
    ggplot2,
    BiocGenerics,
    Biostrings,
    IRanges,
    ShortRead,
    Rsamtools,
    Rbowtie2,
    Rhisat2
Suggests:
    knitr,
    rmarkdown,
    R.utils,
    BiocStyle
VignetteBuilder: knitr
biocViews: Sequencing, SmallRNA, Alignment, Software
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.0
URL: https://github.com/jsun/vsrnaseqtk
BugReports: https://github.com/jsun/vsrnaseqtk/issues
vjcitn commented 2 years ago

I built the vignette and noted a lot of sim_tries and other files created in the folder. You should follow http://contributions.bioconductor.org/r-code.html?q=BiocFileCache#web-querying-and-file-caching and do i/o in the temp folder

jsun commented 2 years ago

I've changed the scripts in the vignette to create files under a temporary directory.

vjcitn commented 2 years ago

This is what I am seeing on current checkout

* checking Rd cross-references ... WARNING
Unknown package ‘RBowtie2’ in Rd xrefs
Missing link or links in documentation object 'calc_coverage.Rd':
  ‘CircSeqAlignTk-class’

Missing link or links in documentation object 'generate_reads.Rd':
  ‘fCircSeqAlignTkSim-class’

See section 'Cross-references' in the 'Writing R Extensions' manual.
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'align_reads'
  ‘read_lengths’

Undocumented arguments in documentation object 'calc_coverage'
  ‘aln’
Documented arguments not in \usage in documentation object 'calc_coverage':
  ‘x’

Undocumented arguments in documentation object 'plot_coverage'
  ‘...’

Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter 'Writing R documentation files' in the 'Writing R
Extensions' manual.
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking installed files from 'inst/doc' ... OK
* checking files in 'vignettes' ... OK
* checking examples ... OK
* checking for unstated dependencies in 'tests' ... WARNING
'library' or 'require' call not declared from: ‘testthat’
* checking tests ...
  Running ‘testthat.R’
 ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
  ══ Failed tests ════════════════════════════════════════════════════════════════
  ── Error (test_workflow.R:8:5): workflow with Bowtie2 ──────────────────────────
  Error in `build_index(fasta = fa, output = "tmp_bt2index")`: unused argument (fasta = fa)
  ── Error (test_workflow.R:21:5): workflow with HISAT2 ──────────────────────────
  Error in `build_index(fasta = fa, output = "tmp_ht2index", aligner = "hisat2")`: unused argument (fasta = fa)
  ── Error (test_workflow.R:44:5): workflow - no reads aligned on edges ──────────
  Error in `build_index(fasta = fa, output = "tmp_index")`: unused argument (fasta = fa)
  ── Error (test_workflow.R:63:5): workflow - reads aligned on edges ─────────────
  Error in `build_index(fasta = fa, output = "tmp_index")`: unused argument (fasta = fa)
  ── Error (test_workflow.R:73:5): workflow - reads without adapter and mismatches correctly aligned ──
  Error in `build_index(fasta = viroid_seq, output = "tmp_index")`: unused argument (fasta = viroid_seq)

  [ FAIL 5 | WARN 0 | SKIP 0 | PASS 0 ]
  Error: Test failures
  Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking running R code from vignettes ...
  'CircSeqAlignTk.Rmd' using 'UTF-8'... OK
 NONE

is it passing check for you? please verify on a clean system, eliminate warnings and errors.

jsun commented 2 years ago

I'm so sorry to have troubled you. I've fixed the codes and checked the code under the clean environment. Many thanks.

bioc-issue-bot commented 2 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 2 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CircSeqAlignTk to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

Kayla-Morrell commented 1 year ago

@jsun - Thank you for submitting to Bioconductor. Please see the initial review of the package below. The required changes must be made while the suggested changes do not have to be (though we strongly encourage them). Comment back here with updates that have been made and when the package is ready for a re-review.

General package development

DESCRIPTION

NEWS

Data

Vignette

Man pages

R code

Best, Kayla

jsun commented 1 year ago

Hi, @Kayla-Morrell, thank you for checking our package, we will update the package according to your comments as soon as possible. I'll let you know after we have corrected all requirements. Thanks.

General package development

  • [x] SUGGESTION: I would also add to the installation instructions how the user can install CircSeqAlignTk from Biocoductor.

Of course; we will change the installation instructions to using BiocManager::install after the acceptance.

DESCRIPTION

  • [x] SUGGESTION: The Description field in the DESCRIPTION is made up by less than 3 sentences. Please consider expanding this field, and structure it as a full paragraph.

Done.

  • [x] SUGGESTION: Consider adding the automatically suggested biocView: SplicedAlignment.

The processes performed by this package don't relate to the spliced alignment.

NEWS

  • [x] REQUIRED: It is great to include a NEWS file to keep track of changes within the package. If you are going to include this file please populate some information so we can be sure it will be formatted properly.

Done.

Data

  • [x] REQUIRED: We require all data within inst/extdata to have corresponding documentation within an inst/script directory. The scripts in this directory can vary. But most importantly they should clearly document how the data in inst/extdata was generated and source information. It should include source URLs and any key information regarding filtering or processing. It can be executable code, sudo code, or a text description. Users should be able to download and be able to roughly reproduce the file or object that is present as data.

Done.

Vignette

  • [x] REQUIRED: Most, if not all, code blocks in the vignette need to be executable. This means that there should be as little eval=FALSE usage as possible.

We removed almost chunks that specified eval=FALSE. Now, only the necessary chunks with eval=FALSE are remined.

  • [x] REQUIRED: Section 3.4 - Summarization and visualization of alignment results I have a different results for:
> head(slot(alncov, 'forward'))
     L21 L22 L23 L24
[1,]  10   8   1   1
[2,]  10   8   1   1
[3,]  10   8   1   1
[4,]  10   9   1   1
[5,]  10   9   1   1
[6,]  10   9   1   1

Is this expected? Since this is different for me, my plots in this section end up being a bit different than the ones I see in the vignette.

It is fine that the results are not the same, since the result depends on the version of Bowtie2 (or Rbowtie2).

  • [x] REQUIRED: Section 4.1 - Generation of synthetic sequence reads I have a different result for:
> head(slot(sim, 'peak'))
  mean std strand         prob
1   53   4      - 0.2266771620
2  239   3      + 0.1221685263
3  307   2      - 0.0004017685
4  343   3      - 0.1309113862
5  237   5      - 0.0936213551
6  244   1      + 0.2162145535
>

Is this expected? It seems the rest of my results for this section are different then what I see in the vignette.

We updated the scripts in the vignette and checked the macOS and Ubuntu (Docker) output with the same result now. If you still have a different result, would you tell me your OS and R version, please? Thanks.

  • [x] REQUIRED: Section 4.2 - Examples of sequence read generation with additional parameters there is an extra ) at the end of the sequence when defining genome_seq. If the user copies this code and pastes into R it will result in an error.

Fixed.

  • [x] REQUIRED: At the end of Section 4.3 - Performance evaluation with the synthetic data I get different results for:
> # coverage of reads in forward strand
> fwd_pred <- slot(alncov, 'forward')
> fwd_true <- slot(slot(sim, 'coverage'), 'forward')
> sqrt(sum((fwd_pred - fwd_true) ^ 2) / length(fwd_true))
[1] 5.10784

and

> # coverage of reads in reversed strand
> rev_pred <- slot(alncov, 'reversed')
> rev_true <- slot(slot(sim, 'coverage'), 'reversed')
> sqrt(sum((rev_pred - rev_true) ^ 2) / length(rev_true))
[1] 1.987844

Is this expected?

It is fine that the results are not the same, since the result depends on the version of Bowtie2 (or Rbowtie2).

  • [x] REQUIRED: Similarly to what you do in the examples, I would have the code that writes files write them to a temporary file. When I ran through your entire vignette locally I ended up with many directories and files that remained after my R session was closed.

We updated the vignette; let the analysis perform under the temporary directory.

Man pages

  • [x] REQUIRED: In the man page for 'calc_coverage' you don't show how aln is defined. So if the user looked up this man page they would only see alncov <- calc_coverage(aln) under the examples. This is not helpful to the user. You should show all the code here.

Done.

R code

  • [x] REQUIRED: When running R CMD check on the package built tarball, the step "checking examples..." tries to open a webpage that isn't successful. The check doesn't ERROR out but the webpage results in "Safari Can't Connect to the Server" "Safari can't open the page '127.0.0.1:21114/session/Rvig.9d1bb6aba3.html' because Safari can't connect to the server '127.0.0.1'". I'm not sure if this is something that you experience when you run R CMD check locally?

In function documentation (man/CircSeqAlignTk-package.Rd), we show an example code to call the vignette with browseVignettes function. It seems that this function starts up the browser automatically. We don't want to add \dontrun in this case; but is there any recommendation to add \dontrun to this example?

  • [x] REQUIRED: Avoid use of direct slot access with @ or slot(). Accessor methods should be created and utilized.

We added get_slot_contents function to avoid using slot or @ directly.

  • [x] REQUIRED: Avoid 'suppressWarnings' / 'Messages' if possible.

This package internally calls Rbowtie2 or Rhisat2 to perform some processes, and the functions in these package output a lot of log messages. Here, we suppress these messages and warnings for readability, this can let users catch the important message and warning easily. Should we remove 'suppressWarnings' / 'Messages' from the code even if we have suitable reasons?

  • [x] SUGGESTION: For formatting reasons, consider shorter lines. There are 59 lines that are > 80 characters long.

We used easy-to-understand variable names; this makes the variable names longer. As a result, some lines are longer than 80 characters. For readability, we will not short these lines, is this OK?

  • [x] SUGGESTION: For formatting reasons, consider multiples of 4 spaces for line indents. There are 194 lines that are not.

For readability, we will keep the current coding style, is this OK?

jsun commented 1 year ago

Hi, @Kayla-Morrell

We updated the package according to your comments, would you please check the package again? Many thanks. If there are further comments, please let me know. For example, we didn't remove suppressWarning according to some reason; but if it still strongly recommends removing it, please let us know. Many thanks.

Kayla-Morrell commented 1 year ago

Hello @jsun - I see that you have pushed the changes to your GitHub but they need to be pushed to our git version of the package. See the comment made here about setting up remote to push to git.bioconductor.org.

jsun commented 1 year ago

Thanks, can I push to git.bioconductor.org at this stage? Because I tried to push the update but I got the permission denied error.

git remote -v
## origin   git@github.com:jsun/CircSeqAlignTk.git (fetch)
## origin   git@github.com:jsun/CircSeqAlignTk.git (push)
## upstream git@git.bioconductor.org:packages/CircSeqAlignTk.git (fetch)
## upstream git@git.bioconductor.org:packages/CircSeqAlignTk.git (push)

git push upstream main:master
## git@git.bioconductor.org: Permission denied (publickey).
## fatal: Could not read from remote repository.
##
## Please make sure you have the correct access rights
## and the repository exists.
lshep commented 1 year ago

Please activate your GitCredentials account. I will send private email with a temporary password for activation. Once that occurs you can manage your ssh keys for access.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 62a18d337272f44a284f37ffb7ce8b1e0f1cffee

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/CircSeqAlignTk to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

jsun commented 1 year ago

Hi, @Kayla-Morrell , thanks to @lshep 's helping, it seems that I successfully pushed the update to BioC git.

Kayla-Morrell commented 1 year ago

@jsun - Thank you for making the necessary changes. I have looked everything over and it all looks good. I'm more than happy to accept the package.

Best, Kayla

bioc-issue-bot commented 1 year ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

lshep commented 1 year ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/jsun.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("CircSeqAlignTk"). The package 'landing page' will be created at

https://bioconductor.org/packages/CircSeqAlignTk

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.