Bioconductor / Contributions

Contribute Packages to Bioconductor
131 stars 33 forks source link

GrafGen #3324

Closed wheelerb closed 3 months ago

wheelerb commented 4 months ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 4 months ago

Hi @wheelerb

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: GrafGen
Title: Classification of Helicobacter Pylori Genomes
Version: 0.99.0
Date: 2024-02-28
Authors@R: c(person(given="William", family="Wheeler", role=c("aut", "cre"),
      email="wheelerb@imsweb.com"),
  person(given="Difei", family="Wang", role="aut",
      email="difei.wang2@nih.gov"),
  person(given="Isaac", family="Zhao", role="aut"),
  person(given="Yumi", family="Jin", role="aut"),
  person(given="Charles", family="Rabkin", role="aut", 
      email="rabkinc@exchange.nih.gov"))
Imports: stats, utils, graphics, ggplot2, plotly, zlibbioc, scales,
        RColorBrewer, dplyr, grDevices, GenomicRanges, shiny, cowplot,
        ggpubr, stringr, rlang
Depends: R (>= 4.3.0)
Suggests: knitr, rmarkdown, RUnit, BiocManager, BiocGenerics,
        BiocStyle, devtools
Description: To classify Helicobacter pylori genomes according to genetic 
    distance from nine reference populations. The nine reference 
    populations are hpgpAfrica, hpgpAfrica-distant, hpgpAfroamerica, 
    hpgpEuroamerica, hpgpMediterranea, hpgpEurope, hpgpEurasia,
    hpgpAsia, and hpgpAklavik86-like.
    The vertex populations are Africa, Europe and Asia.
License: GPL-2
biocViews: Genetics, Software, GenomeAnnotation, Classification
NeedsCompilation: yes
BugReports: https://github.com/wheelerb/GrafGen/issues
VignetteBuilder: knitr
Packaged: 2024-02-28 21:13:45 UTC; wheelerwi
Author: William Wheeler [aut, cre],
  Difei Wang [aut],
  Isaac Zhao [aut],
  Yumi Jin [aut],
  Charles Rabkin [aut]
Maintainer: William Wheeler <wheelerb@imsweb.com>
vjcitn commented 3 months ago

thanks for this submission. please don't put python script in exec. you can use basilisk with reticulate::import_from_path to get the symbols you need with inst/python used to store your python code. basilisk will allow you to specify all non-builtins with versions to be acquired either from pip or conda channels. thanks

wheelerb commented 3 months ago

Thanks for looking at the package. Using basilisk seems very complicated, and the python script is a minor, non-essential component of the R package. I may have to remove the python script.

wheelerb commented 3 months ago

I have updated the package by removing the python script.

lshep commented 3 months ago

It looks like you change a users config with the command config(showTips = FALSE) . It is not recommended to change a users configuration. Is there a reason this is necessary?

wheelerb commented 3 months ago

The config(showTips = FALSE) is in the code to produce interactive plots. I will try leaving it out and see if the plots are any different. Thanks.

wheelerb commented 3 months ago

There was no change in the plots. The config(showTips = FALSE) statement has been removed.

bioc-issue-bot commented 3 months ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): GrafGen_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GrafGen to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 3 months ago

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c313e66cbed7043cfb1019682e48544d9c964da4

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): GrafGen_0.99.4.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GrafGen to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hpages commented 3 months ago

Thanks for this submission.

Only a few minor things that should be easy to address:

  1. It's not clear how dataset grafGen_example_results was obtained. According to its man page it's "The returned object from grafGen in the analysis of a subset of the reference data." Please clarify which subset you're referring to exactly. If it's the one included in the package (in the data.vcf.gz file), then the problem is that calling grafGen() on it produces different results:

    > genoFile  <- system.file("extdata", "data.vcf.gz", package="GrafGen")
    > results <- grafGen(genoFile, print=0)
    > results$table[1:3, ]
            Sample N_SNPs    GD1_x    GD2_y     GD3_z F_percent E_percent A_percent
    1 HpGP-ALG-002  35528 1.325330 1.246303 -0.008719     27.79     72.21         0
    2 HpGP-ALG-004  35528 1.355911 1.264769  0.004511     19.76     80.24         0
    3 HpGP-ALG-005  35528 1.350071 1.267337 -0.003531     19.70     80.30         0
      hpgpAfrica hpgpAfrica-distant hpgpAfroamerica hpgpEuroamerica
    1   0.398096           0.661930        0.324232        0.276226
    2   0.429534           0.660384        0.336875        0.279032
    3   0.420432           0.655047        0.331030        0.275570
      hpgpMediterranea hpgpEurope hpgpEurasia hpgpAsia hpgpAklavik86-like
    1         0.277100   0.305868    0.395434 0.577032           0.597266
    2         0.260454   0.289007    0.380532 0.564095           0.587367
    3         0.256573   0.284675    0.379566 0.565708           0.589778
                Refpop Nearest_neighbor Separation_percent
    1  hpgpEuroamerica hpgpMediterranea               0.32
    2 hpgpMediterranea  hpgpEuroamerica               7.13
    3 hpgpMediterranea  hpgpEuroamerica               7.40
    
    > data(grafGen_example_results)
    > grafGen_example_results$table[1:3, ]
            Sample N_SNPs    GD1_x    GD2_y    GD3_z E_percent F_percent A_percent
    2 HpGP-ALG-002 143705 1.314069 1.246175 0.000711     71.12     28.88         0
    4 HpGP-ALG-004 143705 1.342173 1.262621 0.005026     78.29     21.71         0
    5 HpGP-ALG-005 143705 1.343746 1.266677 0.000658     79.49     20.51         0
      hpgpEurope hpgpAfrica hpgpAsia hpgpAfroamerica hpgpEuroamerica
    2   0.310315   0.397851 0.593879        0.325261        0.278219
    4   0.292363   0.420089 0.577689        0.333375        0.279396
    5   0.286959   0.418200 0.575399        0.330121        0.275956
      hpgpMediterranea hpgpEurasia hpgpAklavik86-like hpgpAfrica-distant
    2         0.279297    0.403331           0.608942           0.668793
    4         0.259688    0.386691           0.599314           0.660547
    5         0.255563    0.384445           0.599825           0.659308
                Refpop Nearest_neighbor Separation_percent
    2  hpgpEuroamerica hpgpMediterranea               0.39
    4 hpgpMediterranea  hpgpEuroamerica               7.59
    5 hpgpMediterranea  hpgpEuroamerica               7.98

    Please make sure that the other included datasets (grafGen_reference_results and grafGen_reference_dataframe) also match the actual results produced by GrafGen when running it on the full reference data (1011 genomes).

  2. Would actually be nice to provide the link to the full reference data in the man page for grafGen_reference_results so that people can try to run GrafGen on it if they want.

  3. Use file.path(dir, "data.vcf.gz") instead of paste0(dir, .Platform$file.sep, "data.vcf.gz") to construct a file path.

  4. Use inherits(f, "try-error") or !inherits(x, "grafpop") instead of "try-error" %in% class(f) or !("grafpop" %in% class(x)).

  5. Do not include the word "error" or "ERROR" in your error messages. The stop() function already does that for you so the message displayed to the end user contains twice the word "error":

    > stop("ERROR: invalid input")
    Error: ERROR: invalid input
  6. No need to use paste() or paste0() to put together an error message because stop() also takes care of that for you. So instead of:

    stop(paste0("ERROR: id must be a column in ", nm))

    just do:

    stop("id must be a column in ", nm)
  7. Lots of internal functions that are not used anywhere (dead code): checkFiles, check_out.file, checkForSep, check.dir, check.list, check_options, default.list, pgo_interactive, etc... A package is easier to maintain in the long run if it only contains what's actually needed and used.

  8. You also have internal utilities that are defined more than once (e.g. check_numVec) so you might want to spend a little bit of time cleaning your R code.

Thanks, H.

hpages commented 3 months ago

and also:

  1. The Shiny app doesn't work for me. Trying to run the code in the vignette:
    ...
    tmp <- createApp(ret)
    reference_results <- tmp$reference_results
    user_results      <- tmp$user_results
    user_metadata     <- tmp$user_metadata
    shiny::runApp(tmp$app)

    produces the following error:

    Listening on http://127.0.0.1:7828
    Warning: Error in get: object 'grafGen_reference_dataframe' not found
      39: get
      38: server
       1: shiny::runApp
    Error in get("grafGen_reference_dataframe") : 
      object 'grafGen_reference_dataframe' not found

Best, H.

wheelerb commented 3 months ago

Thank you for the review. I will take care of these issues.

bioc-issue-bot commented 3 months ago

Received a valid push on git.bioconductor.org; starting a build for commit id: e7c0db1939f26daa375fbb27145a631c4b2234a4

bioc-issue-bot commented 3 months ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder: Linux (Ubuntu 22.04.3 LTS): GrafGen_0.99.5.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GrafGen to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

wheelerb commented 3 months ago

I believe all of the issues have now been taken care of. Thank you. Bill

hpages commented 3 months ago

Thanks! :rocket:

bioc-issue-bot commented 3 months ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

lshep commented 3 months ago

The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/wheelerb.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("GrafGen"). The package 'landing page' will be created at

https://bioconductor.org/packages/GrafGen

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.