Bioconductor / Contributions

Contribute Packages to Bioconductor

metabinR #2796

Status: Closed (gkanogiannis closed this 1 year ago)

gkanogiannis commented 2 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 2 years ago

Hi @gkanogiannis

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: metabinR
Type: Package
Title: Abundance and Compositional Based Binning of Metagenomes
Version: 0.99.0
biocViews: Classification, Clustering, Microbiome, Sequencing, Software
Authors@R:
    c(person(given = "Anestis",
             family = "Gkanogiannis",
             role = c("aut", "cre"),
             email = "anestis@gkanogiannis.com",
             comment = c(ORCID = "0000-0002-6441-0688"))
    )
Description: Provide functions for performing abundance and compositional based 
    binning on metagenomic samples, directly from FASTA or FASTQ files.
    Functions are implemented in Java and called via rJava.
    Parallel implementation that operates directly on input FASTA/FASTQ files 
    for fast execution.
License: GPL-3
Encoding: UTF-8
Language: en-US
LazyData: false
Depends:
    R (>= 4.2)
Imports: 
    methods,
    rJava,
    utils
SystemRequirements: Java (>= 8)
RoxygenNote: 7.2.1
URL: https://github.com/gkanogiannis/metabinR
BugReports: https://github.com/gkanogiannis/metabinR/issues
Suggests: 
    BiocStyle,
    cvms,
    data.table,
    ggplot2,
    gridExtra,
    knitr,
    rmarkdown,
    sabre,
    spelling,
    testthat (>= 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3

vjcitn commented 2 years ago

reads.mapping <- fread(
        system.file("extdata", "reads_mapping.tsv.gz",package = "metabinR")
    )
reads.mapping$AB_id <- abundances$V3[
                                        match(reads.mapping$genome_id,
                                        abundances$V1)
                                    ]
reads.mapping <- reads.mapping[order(reads.mapping$anonymous_read_id),]

This is from the vignette. The syntax and manipulations are complex. Can you define higher-level data structures and methods to make this easier for the user?

vjcitn commented 2 years ago

What if the abundances$V1 values don't match any reads.mapping$genome_id? Of course it works in your example, but we want some defensive programming to help users deal with possible inconsistencies between independently managed resources. This is one of the motivations behind SummarizedExperiment, in which sample-level data and quantifications can have all kinds of identifiers, and we want to hand the user something that has some validity checking on construction.
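
A minimal sketch of the kind of defensive check described here, using base R only. The helper name `add_abundance_id` and the toy tables are hypothetical; only the column names `genome_id`, `anonymous_read_id`, `V1`, `V3` and `AB_id` come from the vignette code quoted above.

```r
# Hypothetical helper: attach abundance-class ids to reads and fail loudly
# when a genome_id in the reads table is absent from the abundance table.
add_abundance_id <- function(reads.mapping, abundances) {
    idx <- match(reads.mapping$genome_id, abundances$V1)
    if (anyNA(idx)) {
        missing_ids <- unique(reads.mapping$genome_id[is.na(idx)])
        stop("genome_id not found in abundances$V1: ",
             paste(missing_ids, collapse = ", "))
    }
    reads.mapping$AB_id <- abundances$V3[idx]
    reads.mapping[order(reads.mapping$anonymous_read_id), , drop = FALSE]
}

# Toy data, made up for illustration
abundances <- data.frame(V1 = c("g1", "g2"), V3 = c(1L, 2L),
                         stringsAsFactors = FALSE)
reads.mapping <- data.frame(
    anonymous_read_id = c("r2", "r1", "r3"),
    genome_id = c("g2", "g1", "g1"),
    stringsAsFactors = FALSE
)
mapped <- add_abundance_id(reads.mapping, abundances)
print(mapped)
```

Without the `anyNA()` guard, unmatched reads would silently receive `NA` as their `AB_id`, which is the kind of inconsistency the comment warns about.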

vjcitn commented 2 years ago

[image: pix]

is that really what we want in the vignette?

vjcitn commented 2 years ago

Why not just print the table?

vjcitn commented 2 years ago

Thanks for the submission, let us know your plans.

gkanogiannis commented 2 years ago

reads.mapping <- fread(
        system.file("extdata", "reads_mapping.tsv.gz",package = "metabinR")
    )
reads.mapping$AB_id <- abundances$V3[
                                        match(reads.mapping$genome_id,
                                        abundances$V1)
                                    ]
reads.mapping <- reads.mapping[order(reads.mapping$anonymous_read_id),]

This is from the vignette. The syntax and manipulations are complex. Can you define higher-level data structures and methods to make this easier for the user?

I updated the vignette to use dplyr-style operations for merging and ordering. I hope it looks better now.

gkanogiannis commented 2 years ago

What if the abundances$V1 values don't match any reads.mapping$genome_id? Of course it works in your example, but we want some defensive programming to help users deal with possible inconsistencies between independently managed resources. This is one of the motivations behind SummarizedExperiment, in which sample-level data and quantifications can have all kinds of identifiers, and we want to hand the user something that has some validity checking on construction.

The purpose of metabinR is to perform three types of binning (AB, CB and HC). It takes a set of FASTA/FASTQ files as input and creates bins of reads. Evaluating the generated bins is beyond the scope of this package, and no mechanism or function is offered for this purpose.

For demonstration only, the vignette shows a simple way to evaluate the bins.

The example input FASTA files are generated by the read simulator CAMISIM, and the simple evaluation uses the design files it produces (for example, the field names of the generated distribution.txt and reads_mapping.tsv).

In the vignette example, reads.mapping$genome_id always matches abundances$genome_id because it is fixed by the design. Users of metabinR are expected to evaluate the produced read bins in their own way.

gkanogiannis commented 2 years ago

Why not just print the table?

Apologies for the ugliness. I changed it to render as a table with knitr::kable.
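
A sketch of printing a table instead of plotting it. The thread does not show which table was involved, so this assumes a cross-tabulation of true class against assigned bin; the vectors `truth` and `bin` are made up. `knitr::kable` is only called if knitr is installed.

```r
# Made-up labels for six reads: true abundance class vs. assigned bin
truth <- c(1, 1, 2, 2, 2, 1)
bin   <- c(1, 1, 2, 2, 1, 1)

# Cross-tabulate assignments; rows are true classes, columns are bins
confusion <- table(truth = truth, bin = bin)
print(confusion)

# In an R Markdown vignette the same table can be rendered with kable
if (requireNamespace("knitr", quietly = TRUE)) {
    print(knitr::kable(as.data.frame.matrix(confusion),
                       caption = "True class (rows) vs. assigned bin (columns)"))
}
```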

gkanogiannis commented 2 years ago

Dear @vjcitn, thank you for the pre-review comments. Please see my replies above. I pushed updated version 0.99.1 to the GitHub repo. Looking forward to the continuation of the review.

vjcitn commented 2 years ago

You've improved the aesthetics somewhat, but I still think defensive programming and integrative data structures like SummarizedExperiment or TreeSummarizedExperiment should be considered.
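
To illustrate what validity checking on construction can look like, here is a minimal S4 sketch using only the `methods` package. It is not SummarizedExperiment or TreeSummarizedExperiment; the class name `BinningTruth`, its slots and the toy tables are hypothetical.

```r
library(methods)

# Hypothetical container pairing a reads table with an abundance table.
# The validity function runs whenever an object is built with new().
setClass("BinningTruth",
    slots = c(reads = "data.frame", abundances = "data.frame"),
    validity = function(object) {
        unknown <- setdiff(object@reads$genome_id,
                           object@abundances$genome_id)
        if (length(unknown) > 0L) {
            return(paste("genome_id missing from abundances:",
                         paste(unknown, collapse = ", ")))
        }
        TRUE
    }
)

# Toy tables, made up for illustration
reads <- data.frame(anonymous_read_id = c("r1", "r2"),
                    genome_id = c("g1", "g2"), stringsAsFactors = FALSE)
abund <- data.frame(genome_id = c("g1", "g2"), AB_id = c(1L, 2L),
                    stringsAsFactors = FALSE)

# Consistent tables construct without complaint
ok <- new("BinningTruth", reads = reads, abundances = abund)

# A mismatched abundance table is rejected at construction time
bad <- try(new("BinningTruth", reads = reads, abundances = abund[1, ]),
           silent = TRUE)
```

The design point is that the consistency check lives in the class definition, so users cannot obtain an object whose identifiers disagree.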

gkanogiannis commented 2 years ago

You've improved the aesthetics somewhat, but I still think defensive programming and integrative data structures like SummarizedExperiment or TreeSummarizedExperiment should be considered.

Dear @vjcitn, this is something that I will certainly consider as an improvement. Thank you.

bioc-issue-bot commented 2 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your GitHub ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account.
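
The version bump mentioned above (for example 0.99.0 to 0.99.1) is an edit to the `Version` field of DESCRIPTION. A minimal sketch, assuming a throwaway DESCRIPTION in a temporary directory; the helper `bump_patch` is hypothetical and not part of any Bioconductor tooling.

```r
# Hypothetical helper: increment the last component of a version string
bump_patch <- function(version) {
    parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
    parts[length(parts)] <- parts[length(parts)] + 1L
    paste(parts, collapse = ".")
}

# Demonstrate on a throwaway DESCRIPTION, not on a real package
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(c("Package: metabinR", "Version: 0.99.0"), desc_file)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(desc_file)
desc[1, "Version"] <- bump_patch(desc[1, "Version"])
write.dcf(desc, desc_file)

new_version <- read.dcf(desc_file, fields = "Version")[1, 1]
print(new_version)
```

On a real DESCRIPTION, `write.dcf()` may reflow multi-line fields, so editing the `Version` line by hand is often simpler.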

bioc-issue-bot commented 2 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/metabinR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

gkanogiannis commented 1 year ago

Dear @hpages, hello. I hope all is well. I was wondering whether we are on track here and whether I will soon receive an initial review of metabinR, so that I have some time before the 26 October deadline for corrections and improvements based on your comments.

Looking forward to receiving your review, Anestis

hpages commented 1 year ago

Hi @gkanogiannis. All is well, thank you. Sorry for the slow response. I'm taking a look at metabinR and will come back with some feedback. Thanks for your patience.

gkanogiannis commented 1 year ago

Dear Hervé @hpages, thank you so much for your time and effort.

gkanogiannis commented 1 year ago

Dear @hpages, I was wondering if it would be possible to receive a review and feedback on this package. I am worried that it will not make it into the manifest of the new release. Thank you again so much for your efforts.

hpages commented 1 year ago

Thanks for your patience @gkanogiannis.

The package looks good. The only minor cosmetic issue is this:

> library(metabinR)
To cite metabinR in publications use:

A BibTeX entry for LaTeX users is

  @Article{,
    title = {A scalable assembly-free variable selection algorithm for biomarker discovery from metagenomes},
    author = {Gkanogiannis Anestis and Thomas Bruls},
    journal = {BMC Bioinformatics},
    year = {2016},
    volume = {Aug 19;17(1):311},
    doi = {10.1186/s12859-016-1186-3},
    url = {https://dx.doi.org/10.1186/s12859-016-1186-3},
  }

This is not considered good practice. Please remove it. Typical Bioconductor workflows start by loading many packages, sometimes dozens or more. Bioconductor recommends that packages be as quiet as possible at load time.
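
One common way to keep a reference available without printing it at load time is to store it in the package's `inst/CITATION` file, so that users see it only when they call `citation("metabinR")`. The thread does not say whether the author did this; the sketch below only shows how such an entry could be built with `utils::bibentry`, with fields taken from the BibTeX output above.

```r
# Sketch of an entry that could live in inst/CITATION (assumed location).
# Fields are copied from the BibTeX shown in the review comment.
cit <- utils::bibentry(
    bibtype = "Article",
    title   = paste("A scalable assembly-free variable selection algorithm",
                    "for biomarker discovery from metagenomes"),
    author  = c(utils::person("Anestis", "Gkanogiannis"),
                utils::person("Thomas", "Bruls")),
    journal = "BMC Bioinformatics",
    year    = "2016",
    doi     = "10.1186/s12859-016-1186-3"
)

# Render the entry as BibTeX on demand rather than on every library() call
bibtex_lines <- utils::toBibtex(cit)
print(bibtex_lines)
```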

Thanks

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 6f76dd3e54be756409cc66839b4dc1807eb49461

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/metabinR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

gkanogiannis commented 1 year ago

Dear @hpages, the citation message on package attach has been removed and the version bumped. Thank you once again for your time and effort in reviewing metabinR.

hpages commented 1 year ago

All looks good. Thanks!

bioc-issue-bot commented 1 year ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

gkanogiannis commented 1 year ago

Thank you very much, and have a good evening, Hervé @hpages!

lshep commented 1 year ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/gkanogiannis.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/
https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2 pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("metabinR"). The package 'landing page' will be created at

https://bioconductor.org/packages/metabinR

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.