Closed jsun closed 1 year ago
Hi @jsun
Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.
The DESCRIPTION file for this package is:
Package: CircSeqAlignTk
Type: Package
Title: A toolkit for end-to-end analysis of RNA-seq data for circular genomes
Version: 0.99.0
Authors@R:
c(person("Jianqiang", "Sun", , "sun@biunit.dev", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-3438-3199")),
person("Xi", "Fu", , role = c("aut")),
person("Wei", "Cao", role = c("aut")))
Description:
CircSeqAlignTk is an R package for end-to-end analysis,
from alignment to visualization,
of RNA-seq data obtained from organelles or organisms
with circular genome sequences,
such as bacteria, viruses, and viroids.
In addition, CircSeqAlignTk implements a tidy interface
to generate synthetic sequencing data that mimic real RNA-seq data,
allowing developers to evaluate the performance of alignment tools,
new alignment algorithms, and new workflows.
Depends: R (>= 4.2)
Imports:
stats,
tools,
utils,
methods,
S4Vectors,
rlang,
magrittr,
dplyr,
tidyr,
ggplot2,
BiocGenerics,
Biostrings,
IRanges,
ShortRead,
Rsamtools,
Rbowtie2,
Rhisat2
Suggests:
knitr,
rmarkdown,
R.utils,
BiocStyle
VignetteBuilder: knitr
biocViews: Sequencing, SmallRNA, Alignment, Software
License: MIT + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.0
URL: https://github.com/jsun/vsrnaseqtk
BugReports: https://github.com/jsun/vsrnaseqtk/issues
I built the vignette and noted a lot of sim_tries and other files created in the folder. You should follow http://contributions.bioconductor.org/r-code.html?q=BiocFileCache#web-querying-and-file-caching and do i/o in the temp folder
I've changed the scripts in the vignette to create files under a temporary directory.
This is what I am seeing on current checkout
* checking Rd cross-references ... WARNING
Unknown package ‘RBowtie2’ in Rd xrefs
Missing link or links in documentation object 'calc_coverage.Rd':
‘CircSeqAlignTk-class’
Missing link or links in documentation object 'generate_reads.Rd':
‘fCircSeqAlignTkSim-class’
See section 'Cross-references' in the 'Writing R Extensions' manual.
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'align_reads'
‘read_lengths’
Undocumented arguments in documentation object 'calc_coverage'
‘aln’
Documented arguments not in \usage in documentation object 'calc_coverage':
‘x’
Undocumented arguments in documentation object 'plot_coverage'
‘...’
Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See chapter 'Writing R documentation files' in the 'Writing R
Extensions' manual.
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking installed files from 'inst/doc' ... OK
* checking files in 'vignettes' ... OK
* checking examples ... OK
* checking for unstated dependencies in 'tests' ... WARNING
'library' or 'require' call not declared from: ‘testthat’
* checking tests ...
Running ‘testthat.R’
ERROR
Running the tests in 'tests/testthat.R' failed.
Last 13 lines of output:
══ Failed tests ════════════════════════════════════════════════════════════════
── Error (test_workflow.R:8:5): workflow with Bowtie2 ──────────────────────────
Error in `build_index(fasta = fa, output = "tmp_bt2index")`: unused argument (fasta = fa)
── Error (test_workflow.R:21:5): workflow with HISAT2 ──────────────────────────
Error in `build_index(fasta = fa, output = "tmp_ht2index", aligner = "hisat2")`: unused argument (fasta = fa)
── Error (test_workflow.R:44:5): workflow - no reads aligned on edges ──────────
Error in `build_index(fasta = fa, output = "tmp_index")`: unused argument (fasta = fa)
── Error (test_workflow.R:63:5): workflow - reads aligned on edges ─────────────
Error in `build_index(fasta = fa, output = "tmp_index")`: unused argument (fasta = fa)
── Error (test_workflow.R:73:5): workflow - reads without adapter and mismatches correctly aligned ──
Error in `build_index(fasta = viroid_seq, output = "tmp_index")`: unused argument (fasta = viroid_seq)
[ FAIL 5 | WARN 0 | SKIP 0 | PASS 0 ]
Error: Test failures
Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking running R code from vignettes ...
'CircSeqAlignTk.Rmd' using 'UTF-8'... OK
NONE
is it passing check for you? please verify on a clean system, eliminate warnings and errors.
I'm so sorry to have troubled you. I've fixed the codes and checked the code under the clean environment. Many thanks.
A reviewer has been assigned to your package. Learn what to expect during the review process.
IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.
Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CircSeqAlignTk
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
@jsun - Thank you for submitting to Bioconductor. Please see the initial review of the package below. The required changes must be made while the suggested changes do not have to be (though we strongly encourage them). Comment back here with updates that have been made and when the package is ready for a re-review.
inst/extdata
to have corresponding
documentation within an inst/script
directory. The scripts in this directory
can vary. But most importantly they should clearly document how the data in
inst/extdata
was generated and source information. It should include source
URLs and any key information regarding filtering or processing. It can be
executable code, sudo code, or a text description. Users should be able to
download and be able to roughly reproduce the file or object that is present as
data.eval=FALSE
usage as
possible.> head(slot(alncov, 'forward'))
L21 L22 L23 L24
[1,] 10 8 1 1
[2,] 10 8 1 1
[3,] 10 8 1 1
[4,] 10 9 1 1
[5,] 10 9 1 1
[6,] 10 9 1 1
Is this expected? Since this is different for me, my plots in this section end up being a bit different than the ones I see in the vignette.
> head(slot(sim, 'peak'))
mean std strand prob
1 53 4 - 0.2266771620
2 239 3 + 0.1221685263
3 307 2 - 0.0004017685
4 343 3 - 0.1309113862
5 237 5 - 0.0936213551
6 244 1 + 0.2162145535
>
Is this expected? It seems the rest of my results for this section are different then what I see in the vignette.
)
at the end of the sequence when
defining genome_seq
. If the user copies this code and pastes into R it will
result in an error.> # coverage of reads in forward strand
> fwd_pred <- slot(alncov, 'forward')
> fwd_true <- slot(slot(sim, 'coverage'), 'forward')
> sqrt(sum((fwd_pred - fwd_true) ^ 2) / length(fwd_true))
[1] 5.10784
and
> # coverage of reads in reversed strand
> rev_pred <- slot(alncov, 'reversed')
> rev_true <- slot(slot(sim, 'coverage'), 'reversed')
> sqrt(sum((rev_pred - rev_true) ^ 2) / length(rev_true))
[1] 1.987844
Is this expected?
aln
is
defined. So if the user looked up this man page they would only see alncov <- calc_coverage(aln)
under the examples. This is not helpful to the user. You
should show all the code here.@
or slot()
. Accessor
methods should be created and utilized.Best, Kayla
Hi, @Kayla-Morrell, thank you for checking our package, we will update the package according to your comments as soon as possible. I'll let you know after we have corrected all requirements. Thanks.
General package development
- [x] SUGGESTION: I would also add to the installation instructions how the user can install CircSeqAlignTk from Biocoductor.
Of course; we will change the installation instructions to using BiocManager::install
after the acceptance.
DESCRIPTION
- [x] SUGGESTION: The Description field in the DESCRIPTION is made up by less than 3 sentences. Please consider expanding this field, and structure it as a full paragraph.
Done.
- [x] SUGGESTION: Consider adding the automatically suggested biocView: SplicedAlignment.
The processes performed by this package don't relate to the spliced alignment.
NEWS
- [x] REQUIRED: It is great to include a NEWS file to keep track of changes within the package. If you are going to include this file please populate some information so we can be sure it will be formatted properly.
Done.
Data
- [x] REQUIRED: We require all data within
inst/extdata
to have corresponding documentation within aninst/script
directory. The scripts in this directory can vary. But most importantly they should clearly document how the data ininst/extdata
was generated and source information. It should include source URLs and any key information regarding filtering or processing. It can be executable code, sudo code, or a text description. Users should be able to download and be able to roughly reproduce the file or object that is present as data.
Done.
Vignette
- [x] REQUIRED: Most, if not all, code blocks in the vignette need to be executable. This means that there should be as little
eval=FALSE
usage as possible.
We removed almost chunks that specified eval=FALSE
. Now, only the necessary chunks with eval=FALSE
are remined.
- [x] REQUIRED: Section 3.4 - Summarization and visualization of alignment results I have a different results for:
> head(slot(alncov, 'forward')) L21 L22 L23 L24 [1,] 10 8 1 1 [2,] 10 8 1 1 [3,] 10 8 1 1 [4,] 10 9 1 1 [5,] 10 9 1 1 [6,] 10 9 1 1
Is this expected? Since this is different for me, my plots in this section end up being a bit different than the ones I see in the vignette.
It is fine that the results are not the same, since the result depends on the version of Bowtie2 (or Rbowtie2).
- [x] REQUIRED: Section 4.1 - Generation of synthetic sequence reads I have a different result for:
> head(slot(sim, 'peak')) mean std strand prob 1 53 4 - 0.2266771620 2 239 3 + 0.1221685263 3 307 2 - 0.0004017685 4 343 3 - 0.1309113862 5 237 5 - 0.0936213551 6 244 1 + 0.2162145535 >
Is this expected? It seems the rest of my results for this section are different then what I see in the vignette.
We updated the scripts in the vignette and checked the macOS and Ubuntu (Docker) output with the same result now. If you still have a different result, would you tell me your OS and R version, please? Thanks.
- [x] REQUIRED: Section 4.2 - Examples of sequence read generation with additional parameters there is an extra
)
at the end of the sequence when defininggenome_seq
. If the user copies this code and pastes into R it will result in an error.
Fixed.
- [x] REQUIRED: At the end of Section 4.3 - Performance evaluation with the synthetic data I get different results for:
> # coverage of reads in forward strand > fwd_pred <- slot(alncov, 'forward') > fwd_true <- slot(slot(sim, 'coverage'), 'forward') > sqrt(sum((fwd_pred - fwd_true) ^ 2) / length(fwd_true)) [1] 5.10784
and
> # coverage of reads in reversed strand > rev_pred <- slot(alncov, 'reversed') > rev_true <- slot(slot(sim, 'coverage'), 'reversed') > sqrt(sum((rev_pred - rev_true) ^ 2) / length(rev_true)) [1] 1.987844
Is this expected?
It is fine that the results are not the same, since the result depends on the version of Bowtie2 (or Rbowtie2).
- [x] REQUIRED: Similarly to what you do in the examples, I would have the code that writes files write them to a temporary file. When I ran through your entire vignette locally I ended up with many directories and files that remained after my R session was closed.
We updated the vignette; let the analysis perform under the temporary directory.
Man pages
- [x] REQUIRED: In the man page for 'calc_coverage' you don't show how
aln
is defined. So if the user looked up this man page they would only seealncov <- calc_coverage(aln)
under the examples. This is not helpful to the user. You should show all the code here.
Done.
R code
- [x] REQUIRED: When running R CMD check on the package built tarball, the step "checking examples..." tries to open a webpage that isn't successful. The check doesn't ERROR out but the webpage results in "Safari Can't Connect to the Server" "Safari can't open the page '127.0.0.1:21114/session/Rvig.9d1bb6aba3.html' because Safari can't connect to the server '127.0.0.1'". I'm not sure if this is something that you experience when you run R CMD check locally?
In function documentation (man/CircSeqAlignTk-package.Rd
), we show an example code to call the vignette with browseVignettes
function. It seems that this function starts up the browser automatically. We don't want to add \dontrun
in this case; but is there any recommendation to add \dontrun
to this example?
- [x] REQUIRED: Avoid use of direct slot access with
@
orslot()
. Accessor methods should be created and utilized.
We added get_slot_contents
function to avoid using slot
or @
directly.
- [x] REQUIRED: Avoid 'suppressWarnings' / 'Messages' if possible.
This package internally calls Rbowtie2 or Rhisat2 to perform some processes, and the functions in these package output a lot of log messages. Here, we suppress these messages and warnings for readability, this can let users catch the important message and warning easily. Should we remove 'suppressWarnings' / 'Messages' from the code even if we have suitable reasons?
- [x] SUGGESTION: For formatting reasons, consider shorter lines. There are 59 lines that are > 80 characters long.
We used easy-to-understand variable names; this makes the variable names longer. As a result, some lines are longer than 80 characters. For readability, we will not short these lines, is this OK?
- [x] SUGGESTION: For formatting reasons, consider multiples of 4 spaces for line indents. There are 194 lines that are not.
For readability, we will keep the current coding style, is this OK?
Hi, @Kayla-Morrell
We updated the package according to your comments, would you please check the package again? Many thanks. If there are further comments, please let me know. For example, we didn't remove suppressWarning
according to some reason; but if it still strongly recommends removing it, please let us know. Many thanks.
Hello @jsun - I see that you have pushed the changes to your GitHub but they need to be pushed to our git version of the package. See the comment made here about setting up remote to push to git.bioconductor.org.
Thanks, can I push to git.bioconductor.org at this stage? Because I tried to push the update but I got the permission denied error.
git remote -v
## origin git@github.com:jsun/CircSeqAlignTk.git (fetch)
## origin git@github.com:jsun/CircSeqAlignTk.git (push)
## upstream git@git.bioconductor.org:packages/CircSeqAlignTk.git (fetch)
## upstream git@git.bioconductor.org:packages/CircSeqAlignTk.git (push)
git push upstream main:master
## git@git.bioconductor.org: Permission denied (publickey).
## fatal: Could not read from remote repository.
##
## Please make sure you have the correct access rights
## and the repository exists.
Please activate your GitCredentials account. I will send private email with a temporary password for activation. Once that occurs you can manage your ssh keys for access.
Received a valid push on git.bioconductor.org; starting a build for commit id: 62a18d337272f44a284f37ffb7ce8b1e0f1cffee
Dear Package contributor,
This is the automated single package builder at bioconductor.org.
Your package has been built on Linux, Mac, and Windows.
Congratulations! The package built without errors or warnings on all platforms.
Please see the build report for more details. This link will be active for 21 days.
Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
git@git.bioconductor.org:packages/CircSeqAlignTk
to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.
Hi, @Kayla-Morrell , thanks to @lshep 's helping, it seems that I successfully pushed the update to BioC git.
@jsun - Thank you for making the necessary changes. I have looked everything over and it all looks good. I'm more than happy to accept the package.
Best, Kayla
Your package has been accepted. It will be added to the Bioconductor nightly builds.
Thank you for contributing to Bioconductor!
Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.
The master branch of your GitHub repository has been added to Bioconductor's git repository.
To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/jsun.keys is not empty), then no further steps are required. Otherwise, do the following:
See further instructions at
https://bioconductor.org/developers/how-to/git/
for working with this repository. See especially
https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/
to keep your GitHub and Bioconductor repositories in sync.
Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at
https://bioconductor.org/checkResults/
(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("CircSeqAlignTk")
. The package 'landing page' will be created at
https://bioconductor.org/packages/CircSeqAlignTk
If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.
Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor
Confirm the following by editing each check box to '[x]'
[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand Bioconductor Package Naming Policy and acknowledge Bioconductor may retain use of package name.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.
[x] I am familiar with the Bioconductor code of conduct and agree to abide by it.
I am familiar with the essential aspects of Bioconductor software management, including:
For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.