Closed gabrielodom closed 5 years ago
This thread will be the continuation of the work done in Issue #64.
Add Bioconductor installation instructions as well
User Authors@R (recommended) or Author:/Maintiner: not both.
You have exported a lot of functions. Are all these functions intended to be used directly by the user or are some used internally only and be removed as an export?
Should have only one news file please remove NEWS (suggested removal) or NEWS.md (suggested keep)
Does not seem like this should be included here and exposed to users. Please remove. Move any relevant vignettes to the vignette directory
What are these scripts? Again are these relevant that they should be exposed to the user. Else remove
Only include data files in extdata. Create a new inst/scripts directory to contain scripts for generating the data. These scripts (either R/sudo code/or paragraph) should indicate the source of the data and how to replicate creation.
In the man pages for colonSurv_df.Rd please include the source information for data where this subset was taken from.
What are the purpose of these directories? They probably should not be git tracked? These directories will be 'ignored' by R CMD build but still cloned over. The desired state is closer to only package relevant files on the master/git repository.
Remove this from the top level directory (git rm)
[1] "aespcOut" "pathwayPCA" "list"
and ovarian_aespcOut$pVals_df
. Could perhaps benefit from an accessor method for pVals?You perform a modification
ovOutGather_df <-
ovarian_aespcOut$pVals_df %>%
mutate(score = -log(rawp)) %>%
slice(1:20) %>%
select(-pathways, -n_tested, -rawp, -FDR_BH)
But it is unclear to me why? Also couldn't this be automated for the user instead of manually run?
While there is a lot of it - generally seems well organized and well written.
We thank the Bioconductor review team, in particular Lori Shepherd, for their valuable time spent reviewing our package and for their insightful feedback. We have carefully thought about all comments and made many revisions accordingly in the revised package and documentations. We hope this resubmission address the reviewer’s concerns well.
Comment:
We encourage the use of BiocParallel over parallel for parallelization implementation. Please explain why this was not used if not updating
Response: We appreciate your comment. During our package design phase, we did consider using the BiocParallel
package due to its superb multi-machine/supercluster functionality and flexibility. However, after considerable thought, we chose to implement Snow-style parallel computing. We felt that the base parallel
package was quite sufficient for our needs. Given that many of our package’s current users work in Windows environments, and that the pathwayPCA algorithm is not overly complex, we decided to use a simple single-machine Snow cluster. We have extensively tested our parallel computing setup on Windows, Macintosh, and Linux systems.
Comment:
Standard Biconductor objects and classes for data should be utilized whereever possible instead of creating new. Please explain why no existing structure could be used or extended.
Response: Thanks for your comment. In response, to make our package compatible with SummarizedExperiment
objects, we have now added a function SE2Tidy()
that can import a SummarizedExperiment
object.
Comment:
Can this be integrated with other Bioconductor packages/input/output
Response: Yes. Our package now interfaces with the SummarizedExperiment
package for assay data and the rWikiPathways
package for pathway collections.
Comment:
Normally require/library calls are not necessary in code if the proper packages (or select functions from packages) were imported in the NAMESPACE. Please investigate this WARNING
Response: This WARNING is triggered by loading a necessary package onto the Snow-style cluster suite. Per the Introduction to BiocParallel
section 4.1.2 on clusters, this library()
call would be required even with the BiocParallel
package. I don't see a way around this.
Comment:
We strongly encourage units tests. They can be helpful in debugging code and to ensure the code is working as intended.
Response: This is an excellent suggestion. In response, we have added unit tests in the /tests/testthat/
directory, as suggested in the Bioconductor Unit Testing guidelines.
Comment:
Add Bioconductor installation instructions as well
Response: We have updated the README.md file, the GitHub README, and the package home page with Bioconductor installation instructions.
Comment:
User Authors@R (recommended) or Author:/Maintiner: not both.
Response: We have removed the redundant "Maintainer" field.
Comment:
You have exported a lot of functions. Are all these functions intended to be used directly by the user or are some used internally only and be removed as an export?
Response: As you suggested, we have removed a few dozen internal functions from the NAMESPACE of our package. However, in order to keep the examples we wrote for these functions (to aid in future development), we have wrapped them in a \dontrun{}
environment. This creates a new NOTE. Please disregard this NOTE.
Comment:
Should have only one news file please remove NEWS (suggested removal) or NEWS.md (suggested keep)
Response: As you suggested, we have removed the NEWS file and kept the NEWS.md file.
Comment:
Does not seem like this should be included here and exposed to users. Please remove. Move any relevant vignettes to the vignette directory
Response: This directory has been removed.
Comment:
Does not seem like this should be included here and exposed to users. Please remove. Move any relevant vignettes to the vignette directory
Response: This directory has been removed.
Comment:
What are these scripts? Again are these relevant that they should be exposed to the user. Else remove
Response: This directory has been removed.
Comment:
Only include data files in extdata. Create a new inst/scripts directory to contain scripts for generating the data. These scripts (either R/sudo code/or paragraph) should indicate the source of the data and how to replicate creation.
Response: As you suggested, we have moved all *.R
files relevant to data generation to inst/scripts
.
Comment:
What are the purpose of these directories? They probably should not be git tracked? These directories will be 'ignored' by R CMD build but still cloned over. The desired state is closer to only package relevant files on the master/git repository.
Response: To clarify, the pkgdown
website-builder utility package builds our package website and stores the files for this website in the docs/
directory.
Comment:
What are the purpose of these directories? They probably should not be git tracked? These directories will be 'ignored' by R CMD build but still cloned over. The desired state is closer to only package relevant files on the master/git repository.
Response (from Gabriel Odom): To clarify, I changed the directory from tests/
to testing_files/
so that R wouldn't run my tests every time I checked the package. I have now removed this directory from the package as you suggested.
Comment:
Include Bioconductor installation instructions
Response: We have added Bioconductor installation instructions to the "Introduction to PathwayPCA" and "Supplement 1 - Quickstart Guide" vignettes.
Comment:
If you created your own class, you show have accessor methods for common slots of interest to avoid direct access. eg. [1] "aespcOut" "pathwayPCA" "list" and ovarian_aespcOut$pVals_df. Could perhaps benefit from an accessor method for pVals?
Response: This is an excellent suggestion. As you suggested, we have added getPathpVals()
as an accessor function for the table of $p$-values returned in objects of class aespcOut
and superpcOut
. The vignettes "Introduction to PathwayPCA", "Supplement 1 - Quickstart Guide", and "Supplement 4 - Methods Walkthrough" now all make use of this new accessor function.
Comment:
You perform a modification
ovOutGather_df <- ovarian_aespcOut$pVals_df %>% mutate(score = -log(rawp)) %>% slice(1:20) %>% select(-pathways, -n_tested, -rawp, -FDR_BH)
But it is unclear to me why? Also couldn't this be automated for the user instead of manually run?
Response: As you suggested, we have implemented a new function that automates this modification. The new accessor function getPathpVals()
has functionality to select the most significant $k$ pathways, transform the $p$-values, and remove the pathways
and n_tested
columns.
Comment:
In the man pages for colonSurv_df.Rd please include the source information for data where this subset was taken from.
Response: We have included a citation of GEO GSE17538 in the documentation for the colonSurv_df
data object.
Comment:
Remove this from the top level directory (git rm)
Response: The file pathwayPCA.pdf
has been removed from the package and from git.
We now have the comments from the Bioconductor review team: https://github.com/Bioconductor/Contributions/issues/1000