Bioconductor / Contributions

Contribute Packages to Bioconductor
131 stars 33 forks source link

Damsel #3325

Closed caitlinpage closed 2 months ago

caitlinpage commented 3 months ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 3 months ago

Hi @caitlinpage

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: Damsel
Type: Package
Title: Damsel: an end to end analysis of DamID
Version: 0.99.0
Authors@R: person("Caitlin", "Page", email = "caitlin.page@petermac.org", role = c("aut", "cre"))
Description: Damsel provides an end to end analysis of DamID data.
    Damsel takes bam files from Dam-only control and fusion samples and counts the reads matching to each GATC region. edgeR is utilised to identify regions of enrichment in the fusion relative to the control. Enriched regions are combined into peaks, and are associated with nearby genes.
    Damsel allows for IGV style plots to be built as the results build, inspired by ggcoverage, and using the functionality and layering ability of ggplot2.
    Damsel also conducts gene ontology testing with bias correction through goseq, and future versions of Damsel will also incorporate motif enrichment analysis.
    Overall, Damsel is the first package allowing for an end to end analysis with visual capabilities. The goal of Damsel was to bring all the analysis into one place, and allow for exploratory analysis within R.
License: MIT + file LICENSE
Encoding: UTF-8
Suggests: 
    biovizBase,
    biomaRt,
    BSgenome.Dmelanogaster.UCSC.dm6,
    knitr,
    limma,
    org.Dm.eg.db,
    rmarkdown,
    testthat (>= 3.0.0),
    TxDb.Dmelanogaster.UCSC.dm6.ensGene
Config/testthat/edition: 3
Depends: 
    R (>= 3.50)
RoxygenNote: 7.3.1
Imports: 
    AnnotationDbi,
    Biostrings,
    dplyr,
    edgeR,
    GenomeInfoDb,
    GenomicFeatures,
    GenomicRanges,
    ggbio,
    ggplot2,
    goseq,
    magrittr,
    patchwork,
    plyranges,
    reshape2,
    rlang,
    Rsamtools,
    Rsubread,
    stats,
    stringr,
    tidyr,
    utils
BugReports: https://github.com/Oshlack/Damsel
URL: https://github.com/Oshlack/Damsel
biocViews: DifferentialMethylation, PeakDetection, GenePrediction, GeneSetEnrichment
VignetteBuilder: knitr
LazyData: FALSE
lshep commented 3 months ago

In the vignette, please hide the first library(Damsel) as it is currently shown as a random code piece before the vignette starts. Also, please change the vignette installations to not include a force=TRUE, this will force a re-installation even if the user has the package already installed (and cost Bioconductor egress cost). The correct installation code should be

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("Damsel")
library(Damsel)
caitlinpage commented 3 months ago

Thank you for the feedback. I have now updated the code to show the correct installation.

bioc-issue-bot commented 3 months ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): Damsel_0.99.0.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/Damsel to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: ba37341a46465a520603c0e1ec90fe8c036a539c

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): Damsel_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/Damsel to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

jianhong commented 3 months ago

Package 'Damsel' Review

Thank you for submitting your package to Bioconductor. The package passed check and build. However there are several things need to be fixed. Please try to answer the comments line by line when you are ready for a second review. Code: Note: please consider; Important: must be addressed.

The DESCRIPTION file

Documentation

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: a3d6fa9ac1627de7893ae49b988e9ee5b9e0a6b9

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): Damsel_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/Damsel to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

caitlinpage commented 2 months ago

Hi @jianhong thank you for the feedback. I've made the following changes:

The DESCRIPTION file

  • [x] Important: R version should be no less than 4.4
  • [x] NOTE: Consider adding the maintainer's ORCID iD in 'Authors@R' with 'comment=c(ORCID="...")'

The NAMESPACE file

  • [x] Important: Function names use camelCase or snake_case and do not include ..

    • in line 13 export(geom_genes.tx) - The function is now geom_genes_tx

unacceptable files

  • [x] Important: unacceptable files present (see .gitignore for a listing).

    • ./tests/testthat/Rplots.pdf - I've added the file to .git.ignore

R code

  • [x] Important:is() or inherits() instead of class(). -I've replaced it with inherits()

  • [ ] NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep ::. However please note that you need to manully double check the import items when you make any change in the DESCRIPTION file during development. My recommendation is to remove one or two repeats to force the dependency check. I don't understand your recommendation to remove one or two repeats to force the dependency check.

  • [ ] NOTE: Vectorize: for loops present, try to replace them by *apply funcitons. I've replaced the loops in the files R/diff_meth_edgeR, R/process_bams. I would prefer to keep the loops in R/helpers, R/peak_calling and R/plot_wrap because I don't know how to vectorise them and gain the same result - particuarly for the loops in R/peak_calling - where the result of the previous very much informs the next result.

  • [x] Important: Use file.path to replace paste File path has replaced paste

  • [x] Important: Remove unused code. All the unused code has been removed

  • [x] NOTE: Try to check the edge condition when using seq.int or seq_len. For example using seq.int(min(5, nrow(data))) to replace seq.int(5) I've followed this recommendation in R/gene_ontology and fixed this in R/gene_annotation. However, I would like to keep this as it is in R/peak_calling as there is a check in the previous function which acts to check the edge case - this loop is skipped if the edge case occurs.

  • [ ] NOTE: Functional programming: code repetition.

    • repetition in ..addDM, and identifyPeaks -fixed by removing the unnecessary code in ..addDM

    • repetition in ..checkForGaps, and ..overlapGaps I would prefer to keep this repetition as it is just the column names and it is easier to read the code with the names there than have to look for the helper functions code.

    • repetition in ..dmReshape, and ..plotCountsReshape -Fixed by moving the code into another function - ..regionRectangle

    • repetition in ..dmTheme, ..peakGatcTheme, and ..themeGenePlot I would prefer to keep this repetition to avoid having too many helper functions within the plots and getting confused

    • repetition in ..plotPeak, and ggplot_add.genes.tx I would prefer to keep this repetition so that the code in the individual functions is clear and shows that it will handle having no data

    • repetition in ggplot_add.dm, ggplot_add.gatc, ggplot_add.genes.tx, and ggplot_add.peak I would prefer to keep this repetition so that each function shows how the data comes in and is processed. Also, as it is used to define various parameters used within the plot, it would be impractical to move this into a helper function - as it wouldn't define the parameters.

Documentation

  • [x] Note: Vignette should use BiocStyle package for formatting.

    • rmd file vignettes/Damsel-workflow.Rmd
  • [x] To avoid the '* NOTE: Consider adding runnable examples to man pages that document exported objects.' in BiocCheck, could you please add runnable examples to
    • pipe.Rd

Thank you for your feedback, I look forward to your response.

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1a1a9b53918f79e8e70de2aef854d9e7cb3856bf

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): Damsel_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/Damsel to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

jianhong commented 2 months ago

Package 'Damsel' Review

In current review, something about reuse the Bioconductor functions. Please try to answer the comments line by line when you are ready for a second review. Code: Note: please consider; Important: must be addressed.

unacceptable files

R code

bioc-issue-bot commented 2 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 8b9627ea100f8bf41208ef4a4adf085fedf21058

bioc-issue-bot commented 2 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): Damsel_0.99.4.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/Damsel to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

caitlinpage commented 2 months ago

Hi @jianhong thank you for your review.

Package 'Damsel' Review

In current review, something about reuse the Bioconductor functions. Please try to answer the comments line by line when you are ready for a second review. Code: Note: please consider; Important: must be addressed.

unacceptable files

  • [x] Important: please remove the file

    • ./tests/testthat/Rplots.pdf This file has now been removed

      General package development

  • [] Note: Consider adding these automatically suggested biocViews: MotifAnnotation Damsel conducts gene ontology testing but does not actually do motif analysis, so I would not like to add the MotifAnnotation BiocView term.

    R code

  • [ ] Note: try to understand more usage cases for is() and inherits(). I believe the extra checks I have made help with this goal, and this is something I will continue to learn and build on as my coding experience grows.
  • [x] NOTE: try to fix the seqnames with GenomeInfoDb::seqlevelsStyle I have fixed some of these cases with a new function I made, in particular in areas that make sense within the code. There are a few I would like to leave as is, because they have been set earlier in the function with this new method, and in order to change them again in GenomeInfoDb I would need to change the columns and then rearrange them after to have the settings I like.
  • [x] Important: Please consider to add drop=FALSE to avoid the reduction of dimension for matrices and arrays. Ignore this if using datatable.

    • In file R/gatc_track.R:

    • at line 50 found ' df <- df[, c("Position", "seqnames", "start", "end", "width")]'

    • In file R/gene_annotation.R:

    • at line 30 found ' genes <- data.frame(genes)[, c("seqnames", "start", "end", "width", "strand", "ensembl_gene_id")]'

    • at line 144 found ' n_regions <- combo[, c("seqnames", "start", "end", "width", "num")]'

    • at line 275 found ' df <- df[, col_order]'

    • In file R/helpers.R:

    • at line 21 found ' df[, c("Position", "seqnames", "start", "end", "width", "strand")]'

    • In file R/plot_helper_fns.R:

    • at line 10 found ' df.select <- df[df$end >= start & df$start <= end, ]'

    • In file R/process_bams.R:

    • at line 36 found ' regions <- regions[,c("Position", "seqnames", "start", "end", "width", "strand")]'

  • [x] Important: Additional input check should be involved in makeDGE for counts.df, countBamInGATC for regions. I have made additional checks for some of the columns used by those functions.

    Documentation

  • [x] Note: Typos: paramter identifyPeaks.Rd:21

Thank you again for your time and your feedback, I look forward to your response.

bioc-issue-bot commented 2 months ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

lshep commented 2 months ago

The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/caitlinpage.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("Damsel"). The package 'landing page' will be created at

https://bioconductor.org/packages/Damsel

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.