Bioconductor / Contributions

Contribute Packages to Bioconductor
133 stars 33 forks source link

pathwayPCA #1000

Closed gabrielodom closed 5 years ago

gabrielodom commented 5 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

Comments:

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 5 years ago

Hi @gabrielodom

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: pathwayPCA
Type: Package
Title: Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection
Version: 0.99.1
Authors@R: c(person("Gabriel", "Odom", email = "gabriel.odom@med.miami.edu", role = c("aut","cre")),
  person("James", "Ban", email = "yuguang.ban@med.miami.edu", role = c("aut")),
  person("Lizhong", "Liu", email = "lxl816@miami.edu", role = c("aut")),
  person("Lily", "Wang", email = "lily.wang@med.miami.edu", role = c("aut")),
  person("Steven", "Chen", email = "steven.chen@med.miami.edu", role = c("aut")))
Maintainer: Gabriel Odom <gabriel.odom@med.miami.edu>
Description: Apply the Supervised PCA and Adaptive, Elastic-Net, Sparse PCA
  methods to extract principal components from each pathway. Use these pathway-
  specific principal components as the design matrix relating the response to
  each pathway. Return the model fit statistic p-values, and adjust these values
  for False Discovery Rate. Return a data frame of the pathways sorted by their
  adjusted p-values. This package has corresponding vignettes hosted in the
  ''User Guides'' page of <https://gabrielodom.github.io/pathwayPCA/index.html>,
  and the website for the development information is hosted at
  <https://github.com/gabrielodom/pathwayPCA>.
License: GPL-3
Depends: R (>= 3.5)
Imports:
    lars,
    methods,
    parallel,
    stats,
    survival,
    utils
Suggests:
    knitr,
    reshape2,
    rmarkdown,
    survminer,
    tidyverse,
    circlize
biocViews:
    CopyNumberVariation,
    DNAMethylation,
    GeneExpression,
    SNP,
    Transcription,
    GenePrediction,
    GeneSetEnrichment,
    GeneSignaling,
    GeneTarget,
    GenomeWideAssociation,
    GenomicVariation,
    CellBiology,
    Epigenetics,
    FunctionalGenomics,
    Genetics,
    Lipidomics,
    Metabolomics,
    Proteomics,
    SystemsBiology,
    Transcriptomics,
    Classification,
    DimensionReduction,
    FeatureExtraction,
    PrincipalComponent,
    Regression,
    Survival,
    MultipleComparison,
    Pathways
Encoding: UTF-8
LazyData: true
RoxygenNote: 6.1.1
Collate: 
    'CreatePathwayCollection.R'
    'createClass_OmicsPath.R'
    'createClass_validOmics.R'
    'accessClass_OmicsPath.R'
    'createClass_OmicsSurv.R'
    'accessClass_OmicsSurv.R'
    'accessClass_OmicsRegCateg.R'
    'createClass_OmicsCateg.R'
    'createClass_OmicsReg.R'
    'accessClass_OmicsPathData.R'
    'accessClass_pathwayCollection.R'
    'accessClass_pcOut.R'
    'aesPC_calculate_AESPCA.R'
    'aesPC_calculate_LARS.R'
    'aesPC_extract_OmicsPath_PCs.R'
    'aesPC_permtest_CoxPH.R'
    'aesPC_permtest_GLM.R'
    'aesPC_permtest_LM.R'
    'aesPC_unknown_matrixNorm.R'
    'aesPC_wrapper.R'
    'createOmics_All.R'
    'createOmics_CheckAssay.R'
    'createOmics_CheckPathwayCollection.R'
    'createOmics_CheckSampleIDs.R'
    'createOmics_JoinPhenoAssay.R'
    'createOmics_TrimPathwayCollection.R'
    'createOmics_Wrapper.R'
    'data_colonSubset.R'
    'data_genesetSubset.R'
    'data_wikipathways.R'
    'pathwayPCA.R'
    'printClass_Omics_All.R'
    'printClass_pathwayCollection.R'
    'superPC_model_CoxPH.R'
    'superPC_model_GLM.R'
    'superPC_model_LS.R'
    'superPC_model_tStats.R'
    'superPC_model_train.R'
    'superPC_modifiedSVD.R'
    'superPC_optimWeibullParams.R'
    'superPC_optimWeibull_pValues.R'
    'superPC_pathway_tControl.R'
    'superPC_pathway_tScores.R'
    'superPC_pathway_tValues.R'
    'superPC_permuteSamples.R'
    'superPC_wrapper.R'
    'utils_adjust_and_sort_pValues.R'
    'utils_load_test_data_onto_PCs.R'
    'utils_multtest_pvalues.R'
    'utils_read_gmt.R'
    'utils_transpose_assay.R'
    'utils_write_gmt.R'
VignetteBuilder: knitr
URL:
  https://gabrielodom.github.io/pathwayPCA/;
  https://github.com/gabrielodom/pathwayPCA
BugReports: https://github.com/gabrielodom/pathwayPCA/issues

Add SSH keys to your GitHub account. SSH keys will are used to control access to accepted Bioconductor packages. See these instructions to add SSH keys to your GitHub account.

bioc-issue-bot commented 5 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

mtmorgan commented 5 years ago

Please update your package to interoperate with (accept as inputs / return as outputs) standard Bioconductor data representations, especially SummarizedExperiment

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

9df8fbc Per Issue #64, this commit is the last of the chan...

gabrielodom commented 5 years ago

@mtmorgan Thank you for the comment. We have added a function to interface the SummarizedExperiment class with our pathway-based -Omics class.

To the reviewer: The two ERRORs I cannot fix are:

  1. System Files found that should not be git tracked: pathwayPCA.Rproj
  2. ERROR: Maintainer must subscribe to the bioc-devel mailing list. Subscribe here: https://stat.ethz.ch/mailman/listinfo/bioc-devel

For ERROR 1, the pathwayPCA.Rproj file is listed in .gitignore. I have used the BFG to remove the pathwayPCA.Rproj file from .git/. When I search for it in the .git folder, I am unable to find it. My team and I are researching other options.

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

lshep commented 5 years ago

Thank you - For the remaining ERRORS:

  1. It seems like this is remedied in the last build report.
  2. We require all package maintainers to sign up for the mailing list. The gabriel.odom@med.miami.edu must register on the mailing list. Please check spam folders for the final confirmation as sometimes it gets marked as spam.

I will perform a more thorough review of the package within the next week but It will be recommended to fix all WARNINGS/NOTES in the build report.

gabrielodom commented 5 years ago

Thank you @lshep. We are still working on the package install size and other warnings / notes. Our issue thread is https://github.com/gabrielodom/pathwayPCA/issues/64

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

3bd452b Per Issue #64, cleaning up the package. Version bu...

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

lshep commented 5 years ago

General

Build report

readme

Description

Namespace

news

inst/vignette_drafts/vignette_docx

inst/misc_scripts

inst/extdata

data

docs/testing_files

pathwayPCA.pdf

vignette Introduction_to_pathwayPCA

Could perhaps benefit from an accessor method for pVals? 

- [ ] You perform a modification 

ovOutGather_df <-

Take in the results data frame,

ovarian_aespcOut$pVals_df %>%

add the score variable,

mutate(score = -log(rawp)) %>%

subset the top 20 pathways, and

slice(1:20) %>%

remove the columns we don't need

select(-pathways, -n_tested, -rawp, -FDR_BH)


But it is unclear to me why?  Also couldn't this  be automated for the user
instead of manually run? 

**R code**

While there is a lot of it - generally seems well organized and well written. 
gabrielodom commented 5 years ago

@lshep, thank you. We are working to address these points.

lshep commented 5 years ago

Just wanted to check in regarding progress. The deadline for accepting new packages for this release has been announced as Wed. April 24th.

gabrielodom commented 5 years ago

Hi @lshep, we are in the process of re-submitting this afternoon or tomorrow morning. Thank you for checking up on us!

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

4eb8626 Package version bump to show completion of Biocond...

gabrielodom commented 5 years ago

title: "Response to Bioconductor Review" author: "Gabriel Odom, Lizhong Liu, James Ban, Lily Wang, and Steven Chen" date: "March 28, 2019" output: html_document

Introduction

We thank the Bioconductor review team, in particular Lori Shepherd, for their valuable time spent reviewing our package and for their insightful feedback. We have carefully thought about all comments and made many revisions accordingly in the revised package and documentations. We hope this resubmission address the reviewer’s concerns well.


General Comments

Comment:

We encourage the use of BiocParallel over parallel for parallelization implementation. Please explain why this was not used if not updating

Response: We appreciate your comment. During our package design phase, we did consider using the BiocParallel package due to its superb multi-machine/supercluster functionality and flexibility. However, after considerable thought, we chose to implement Snow-style parallel computing. We felt that the base parallel package was quite sufficient for our needs. Given that many of our package’s current users work in Windows environments, and that the pathwayPCA algorithm is not overly complex, we decided to use a simple single-machine Snow cluster. We have extensively tested our parallel computing setup on Windows, Macintosh, and Linux systems.

Comment:

Standard Biconductor objects and classes for data should be utilized whereever possible instead of creating new. Please explain why no existing structure could be used or extended.

Response: Thanks for your comment. In response, to make our package compatible with SummarizedExperiment objects, we have now added a function SE2Tidy() that can import a SummarizedExperiment object.

Comment:

Can this be integrated with other Bioconductor packages/input/output

Response: Yes. Our package now interfaces with the SummarizedExperiment package for assay data and the rWikiPathways package for pathway collections.


Build Report

Comment:

Normally require/library calls are not necessary in code if the proper packages (or select functions from packages) were imported in the NAMESPACE. Please investigate this WARNING

Response: This WARNING is triggered by loading a necessary package onto the Snow-style cluster suite. Per the Introduction to BiocParallel section 4.1.2 on clusters, this library() call would be required even with the BiocParallel package. I don't see a way around this.

Comment:

We strongly encourage units tests. They can be helpful in debugging code and to ensure the code is working as intended.

Response: This is an excellent suggestion. In response, we have added unit tests in the /tests/testthat/ directory, as suggested in the Bioconductor Unit Testing guidelines.


Top-Level Files

README.md

Comment:

Add Bioconductor installation instructions as well

Response: We have updated the README.md file, the GitHub README, and the package home page with Bioconductor installation instructions.

DESCRIPTION

Comment:

User Authors@R (recommended) or Author:/Maintiner: not both.

Response: We have removed the redundant "Maintainer" field.

NAMESPACE

Comment:

You have exported a lot of functions. Are all these functions intended to be used directly by the user or are some used internally only and be removed as an export?

Response: As you suggested, we have removed a few dozen internal functions from the NAMESPACE of our package. However, in order to keep the examples we wrote for these functions (to aid in future development), we have wrapped them in a \dontrun{} environment. This creates a new NOTE. Please disregard this NOTE.

NEWS.md

Comment:

Should have only one news file please remove NEWS (suggested removal) or NEWS.md (suggested keep)

Response: As you suggested, we have removed the NEWS file and kept the NEWS.md file.


inst/ Files

vignette_drafts/

Comment:

Does not seem like this should be included here and exposed to users. Please remove. Move any relevant vignettes to the vignette directory

Response: This directory has been removed.

vignette_docx/

Comment:

Does not seem like this should be included here and exposed to users. Please remove. Move any relevant vignettes to the vignette directory

Response: This directory has been removed.

misc_scripts/

Comment:

What are these scripts? Again are these relevant that they should be exposed to the user. Else remove

Response: This directory has been removed.

extdata/

Comment:

Only include data files in extdata. Create a new inst/scripts directory to contain scripts for generating the data. These scripts (either R/sudo code/or paragraph) should indicate the source of the data and how to replicate creation.

Response: As you suggested, we have moved all *.R files relevant to data generation to inst/scripts.


Top-Level Directories

docs/

Comment:

What are the purpose of these directories? They probably should not be git tracked? These directories will be 'ignored' by R CMD build but still cloned over. The desired state is closer to only package relevant files on the master/git repository.

Response: To clarify, the pkgdown website-builder utility package builds our package website and stores the files for this website in the docs/ directory.

testing_files/

Comment:

What are the purpose of these directories? They probably should not be git tracked? These directories will be 'ignored' by R CMD build but still cloned over. The desired state is closer to only package relevant files on the master/git repository.

Response (from Gabriel Odom): To clarify, I changed the directory from tests/ to testing_files/ so that R wouldn't run my tests every time I checked the package. I have now removed this directory from the package as you suggested.


Vignette Comments

Comment:

Include Bioconductor installation instructions

Response: We have added Bioconductor installation instructions to the "Introduction to PathwayPCA" and "Supplement 1 - Quickstart Guide" vignettes.

Comment:

If you created your own class, you show have accessor methods for common slots of interest to avoid direct access. eg. [1] "aespcOut" "pathwayPCA" "list" and ovarian_aespcOut$pVals_df. Could perhaps benefit from an accessor method for pVals?

Response: This is an excellent suggestion. As you suggested, we have added getPathpVals() as an accessor function for the table of $p$-values returned in objects of class aespcOut and superpcOut. The vignettes "Introduction to PathwayPCA", "Supplement 1 - Quickstart Guide", and "Supplement 4 - Methods Walkthrough" now all make use of this new accessor function.

Comment:

You perform a modification

ovOutGather_df <-
ovarian_aespcOut$pVals_df %>%
mutate(score = -log(rawp)) %>%
slice(1:20) %>% 
select(-pathways, -n_tested, -rawp, -FDR_BH)

But it is unclear to me why? Also couldn't this be automated for the user instead of manually run?

Response: As you suggested, we have implemented a new function that automates this modification. The new accessor function getPathpVals() has functionality to select the most significant $k$ pathways, transform the $p$-values, and remove the pathways and n_tested columns.


Miscellaneous

Data Documentation

Comment:

In the man pages for colonSurv_df.Rd please include the source information for data where this subset was taken from.

Response: We have included a citation of GEO GSE17538 in the documentation for the colonSurv_df data object.

PDF Manual

Comment:

Remove this from the top level directory (git rm)

Response: The file pathwayPCA.pdf has been removed from the package and from git.

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

30e27be Somehow lost the Suggests: testthat line.

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

lshep commented 5 years ago

Thank you. The changes look good. Thank you for such detailed documentation and vignette sections and adding the additional sections showing its interaction with other Bioconductor packages.

bioc-issue-bot commented 5 years ago

Your package has been accepted. It will be added to the Bioconductor Git repository and nightly builds. Additional information will be posed to this issue in the next several days.

Thank you for contributing to Bioconductor!

mtmorgan commented 5 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/gabrielodom.keys is not empty), then no further steps are required. Otherwise, do the following:

  1. Add an SSH key to your github account
  2. Submit your SSH key to Bioconductor

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nigthly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, you package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("pathwayPCA"). The package 'landing page' will be created at

https://bioconductor.org/packages/pathwayPCA

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.