Bioconductor / Contributions

Contribute Packages to Bioconductor
135 stars 33 forks source link

pgxRpi #3120

Closed hangjiaz closed 1 year ago

hangjiaz commented 1 year ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 1 year ago

Hi @hangjiaz

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: pgxRpi
Title: R wrapper for Progenetix 
Version: 0.99.0
Authors@R: 
    c(person(given = "Hangjia",
 family = "Zhao",
 role = c("aut", "cre"),
 email = "hangjia.zhao@uzh.ch",
 comment = c(ORCID = "0000-0001-8376-5751")),
       person(given = "Michael",
 family = "Baudis",
 role = c("aut"),
 email = "michael.baudis@mls.uzh.ch",
 comment = c(ORCID = "0000-0002-9903-4248")))
Description: The package is an R wrapper for Progenetix REST API built upon the Beacon v2 protocol. Its purpose is to provide a seamless way for retrieving genomic data from Progenetix database—an open resource dedicated to curated oncogenomic profiles. Empowered by this package, users can effortlessly access and visualize data from Progenetix.
biocViews: CopyNumberVariation, GenomicVariation, DataImport, Software
License: Artistic-2.0
Encoding: UTF-8
LazyData: FALSE
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.2
Imports:
    utils,
    methods,
    grDevices,
    graphics,
    circlize,
    rjson,
    dplyr,
    attempt,
    lubridate,
    survival,
    survminer,
    ggplot2,
    plyr
Depends: 
    R (>= 4.2)
Suggests: 
    BiocStyle,
    rmarkdown,
    knitr,
    testthat,
    httr,
    RColorBrewer,
    pheatmap
BugReports: https://github.com/progenetix/pgxRpi/issues
URL: https://github.com/progenetix/pgxRpi
VignetteBuilder: rmarkdown
vjcitn commented 1 year ago

Please explain in your vignette what "progenetix" is. Is it a versioned data resource? Please imagine a newcomer coming to your package, reading the vignettes to find inspiration to learn about your tool.

hangjiaz commented 1 year ago

Hi @vjcitn,

Thanks for your feedback. I have updated the vignettes as suggested.

bioc-issue-bot commented 1 year ago

Your package has been added to git.bioconductor.org to continue the pre-review process. A build report will be posted shortly. Please fix any ERROR and WARNING in the build report before a reviewer is assigned or provide a justification on why you feel the ERROR or WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. All changes should be pushed to git.bioconductor.org moving forward. It is required to push a version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.0.tar.gz Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.0.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

A reviewer has been assigned to your package for an indepth review. Please respond accordingly to any further comments from the reviewer.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1ba9b22e0447313c481554d630b9d0616f0133de

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 0dfa02e1969f1abad6c93355da2603e63641166e

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.2.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: c086652662e4a250f7af53d2c9e625eae5a598da

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.3.tar.gz Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.3.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1b156cc5afc7c894101e05b578d92d30c76d4c1b

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.4.tar.gz Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.4.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

LiNk-NY commented 1 year ago

Hi @hangjiaz

Thank you for your submission. After a brief review, there were some major issues with the package. Please see the notes below.

Best, Marcel


pgxRpi

DESCRIPTION

NAMESPACE

vignettes

R

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 1feb432f1c09dbf1b17ee69945b7bacbbd0b4860

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.5.tar.gz macOS 12.6.5 Monterey: pgxRpi_0.99.5.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hangjiaz commented 1 year ago

Hi Marcel @LiNk-NY,

Thanks very much for your review and valuable suggestions! The point-to-point response is below:

pgxRpi

  • The package does not make use of Bioconductor infrastructure (e.g., GRanges, RaggedExperiment, etc.) and copies code from the copynumber package. Has this been cleared with the maintainer of copynumber?
  • In the latest version, the main change is in the data structure of CNV frequency data, which consists of CNV frequency values and associated metadata. Originally this data was a list with two slots "data" and "meta". Following suggestions, now we store "freq_pgxfreq" using the GRangesList infrastructure and store "freq_pgxmatrix" using the RangedSummarizedExperiment container based on their different data formats.
  • The code in “helper_pgxFreqplot.R” is taken from the copynumber package with minor modifications. Since copynumber uses the Artistic 2.0 license, based on the rule I queried, I could distribute the modified version as source, provided that I clearly document how it differs from the standard version and use the original license. pgxRpi uses the same license as copynumebr (Artistic 2.0) and I have placed the reference and modification notice at the top of the relevant script “helper_pgxFreqplot.R”. However, if this approach is not in accordance with proper licensing or if there are any concerns, kindly inform me, and I will make the necessary adjustments.

DESCRIPTION

  • Wrap Description to 80 line width. Changed

    NAMESPACE

  • Minimal number of exports, looks good.

vignettes

  • How can users discover filters, sample and individual identifiers? Added description of concepts for filters, sample, and individual identifiers and how to discover them in vignette "Introduction_1_loadmetadata".
  • Is the package's main aim to provide an API to the data? There are examples of analyses and plotting which may be more appropriate in another package(s). Agreed and removed survival analysis in vignette "Introduction_1_loadmetadata" and CNV frequency clustering analysis in vignette "Introduction_3_loadfrequency".
  • The cnv_matrix has combined data in the column names, is there a way to resolve this and have uniform names / data? For example, some columns are X.67000000.68000000.DEL and others are X1.1400000.2400000.DUP. The column name of cnv_matrix follows the format [chr].[start].[end].[cnv type]. Importing the data into R causes an "X" prefix before autosomes because they are numbers, which can be misleading. To fix this, I replaced the "X" prefix with "chr". For example, the columns are chrX.67000000.68000000.DEL and chr1.1400000.2400000.DUP now.
  • It looks like there are multiple copies of the data in freq_pgxfreq. Consider only providing either both "NCIT:C4038", "pgx:icdom-85003" or just "total".
  • Why not use a Bioconductor representation for freq_pgxfreq such as GRanges or RaggedExperiment? The new "freq_pgxfreq" is a GRangeslist containing multiple GRanges objects for each filer, and the "total" data has been removed. The corresponding vignette has been changed accordingly.

    R

  • Remove copy-and-append code in transform_id. It seems that you simply want to do paste(..., collapse = ","). Fixed, I admit the original code is stupid.
  • Rename the data to have more specific names (e.g., hg19 is too broad). I renamed the data from "hg19" "hg38" to "hg19_cytoband" "hg38_cytoband".
  • It looks like there is code copied and pasted from the copynumber package. I restructured the previous reference announcement and added details about which functions I changed.
  • Reduce code repetition by creating helper functions, e.g., for obtaining remain.id from multiple URLs. This will also reduce the cyclomatic complexity of the software. Agreed, I created a disease_code_check helper function to reduce code repetition for the pgidCheck function.

Please let me know if there are further changes that need to be made.

Best, Hangjia

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 4961fe6f85e54f1bed6c3f64c3448b2cd45e8f75

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.6.tar.gz Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.6.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

LiNk-NY commented 1 year ago

Hi HangJia, @hangjiaz

Thank you for making those changes. The package is in a better state.

I was suggesting that it would be good to expose the terms to the users in addition to referring them to the API documentation. For example, you could expose the "resources" section in https://progenetix.org/beacon/filtering_terms/ to inform the users what resource IDs are available. With the resource ID, the users can then see a list of available IDs parsed from the linked JSON file, e.g.,:

            {
                "count": 60,
                "id": "cbioportal:acyc_mskcc_2013",
                "label": "",
                "type": "ontologyTerm"
            },
            {
                "count": 253,
                "id": "cbioportal:aml_target_2018_pub",
                "label": "",
                "type": "ontologyTerm"
            }

This would enhance the user's experience and they would spend less time browsing through all the IDs on the website.

Please see GenomicDataCommons::available_fields(x=GenomicDataCommons::cases()) as an example of returning filter IDs. This may also be applicable to other filters such as biosamples etc.

Consider using a more formal API request mechanism, i.e. with the httr package and using httr::GET rather than rjson::fromJSON which would allow for httr::stop_for_status and httr::content calls on the response.

Please use a more simple default argument for filters in pgxLoader e.g.,: c("biosample", "variant","frequency","individual") and select from the options with match.arg. Any argument outside of that would produce an error from match.arg. If possible, reduce the number of if statements in the function.

Note that if (x == TRUE) is the same as if (x) in reference to R/pgxSegprocess.R. If you are expecting alternative values in those variables, consider using isTRUE().

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: ca0b2941be1f8103d89e6474e3096ffb6251dd17

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: macOS 12.6.5 Monterey: pgxRpi_0.99.7.tar.gz Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.7.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hangjiaz commented 1 year ago

Hi Marcel, @LiNk-NY

Thanks very much for your valuable suggestions, which are really helpful! Here is my response:

I was suggesting that it would be good to expose the terms to the users in addition to referring them to the API documentation. For example, you could expose the "resources" section in https://progenetix.org/beacon/filtering_terms/ to inform the users what resource IDs are available. With the resource ID, the users can then see a list of available IDs parsed from the linked JSON file. This would enhance the user's experience and they would spend less time browsing through all the IDs on the website.

Please see GenomicDataCommons::available_fields(x=GenomicDataCommons::cases()) as an example of returning filter IDs. This may also be applicable to other filters such as biosamples etc.

I completely agree with the idea of exposing all available filters to make querying easier. To address this, I've created the pgxFilter function, which allows users to access all filters or specific ones based on a specified prefix. This function, along with the pgxCount function, provides more comprehensive metadata. As for the omission of other identifiers such as  biosample_id, it is because it's part of the biosample metadata returned by the pgxLoader function.

Consider using a more formal API request mechanism, i.e. with the httr package and using httr::GET rather than rjson::fromJSON which would allow for httr::stop_for_status and httr::content calls on the response.

Noted and updated. The package no longer utilizes the rjson package, and I've adjusted the DESCRIPTION accordingly.

Please use a more simple default argument for filters in pgxLoader e.g.,: c("biosample", "variant","frequency","individual") and select from the options with match.arg. Any argument outside of that would produce an error from match.arg. If possible, reduce the number of if statements in the function.

Note that if (x == TRUE) is the same as if (x) in reference to R/pgxSegprocess.R. If you are expecting alternative values in those variables, consider using isTRUE().

I've simplified the code accordingly.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: d51567a563a3ac07cd08d66932025166d02d9028

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.8.tar.gz macOS 12.6.5 Monterey: pgxRpi_0.99.8.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

LiNk-NY commented 1 year ago
  • Your suggestion regarding the match.arg function is greatly appreciated. I've simplified the type argument to accept values "b" (biosample), "i" (individual), "v" (variant), or "f" (frequency), and this is now included in documentation.

It's best to leave the argument as type = c("biosample", "variant","frequency","individual") and then use match.arg within the function to select a default (usually the first option)

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: a8c078ecfa8b2363907f16033074c857dde50edf

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Build System.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Bioconductor Build System: Linux (Ubuntu 22.04.2 LTS): pgxRpi_0.99.9.tar.gz macOS 12.6.5 Monterey: pgxRpi_0.99.9.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/pgxRpi to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

hangjiaz commented 1 year ago

Hi Marcel, @LiNk-NY

It's best to leave the argument as type = c("biosample", "variant","frequency","individual") and then use match.arg within the function to select a default (usually the first option)

Thanks for your suggestions, I have changed "pgxLoader.R" and vignettes accordingly.

LiNk-NY commented 1 year ago

Hi HangJia, @hangjiaz

Thank you for making those changes! Your package has been accepted.

Best regards, Marcel

bioc-issue-bot commented 1 year ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

Reviewers for Bioconductor packages are volunteers from the Bioconductor community. If you are interested in becoming a Bioconductor package reviewer, please see Reviewers Expectations.

lshep commented 1 year ago

The default branch of your GitHub repository has been added to Bioconductor's git repository as branch devel.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/hangjiaz.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("pgxRpi"). The package 'landing page' will be created at

https://bioconductor.org/packages/pgxRpi

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.