GSgalgoR: Identification and Study of Prognostic Gene Expression Signatures in Cancer

harpomaxx commented 4 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Repository: https://github.com/harpomaxx/GSgalgoR/

Confirm the following by editing each check box to '[x]'

[x] I understand that by submitting my package to Bioconductor, the package source and all review commentary are visible to the general public.
[x] I have read the Bioconductor Package Submission instructions. My package is consistent with the Bioconductor Package Guidelines.
[x] I understand that a minimum requirement for package acceptance is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS. Passing these checks does not result in automatic acceptance. The package will then undergo a formal review and recommendations for acceptance regarding other Bioconductor standards will be addressed.
[x] My package addresses statistical or bioinformatic issues related to the analysis and comprehension of high throughput genomic data.
[x] I am committed to the long-term maintenance of my package. This includes monitoring the support site for issues that users may have, subscribing to the bioc-devel mailing list to stay aware of developments in the Bioconductor community, responding promptly to requests for updates from the Core team in response to changes in R or underlying software.

I am familiar with the essential aspects of Bioconductor software management, including:

[x] The 'devel' branch for new packages and features.
[x] The stable 'release' branch, made available every six months, for bug fixes.
[x] Bioconductor version control using Git (optionally via GitHub).

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 4 years ago

Hi @harpomaxx

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: GSgalgoR
Type: Package
Title: An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer
Version: 0.99.0
Authors@R: c(person("Martin", "Guerrero", role = c("aut"),
           email = "mguerrero@conicet-mendoza.gob.ar"),
    person("Carlos", "Catania", role = "cre",email="harpomaxx@gmail.com"))
Author: Martin Guerrero [aut], Carlos Catania [cre]
Maintainer:  Carlos Catania <harpomaxx@gmail.com>
Description: A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. 
   The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of 
   genetic algorithms for feature selection. The algorithm search for the  optimal number of clusters determination considering the features that 
   maximize the survival difference between sub-types while keeping cluster consistency high.
License: MIT + file LICENSE
biocViews: GeneExpression, Transcription, Clustering, Classification, Survival
Encoding: UTF-8
LazyData: true
Imports: cluster, doParallel, foreach, matchingR, nsga2R, survival, proxy, stats, methods,
Suggests:  knitr, rmarkdown, ggplot2,  BiocStyle, genefu, survcomp, Biobase, survminer, breastCancerTRANSBIG, breastCancerUPP, iC10TrainingData, pamr, testthat
URL: https://github.com/harpomaxx/GSgalgoR
BugReports: https://github.com/harpomaxx/GSgalgoR/issues
RoxygenNote: 7.1.0
VignetteBuilder: knitr

bioc-issue-bot commented 4 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor used your GitHub ssh keys for git.bioconductor.org access. To manage keys and future access you may want to activate your Bioconductor Git Credentials Account

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GSgalgoR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 15b2ff978391815bd843c43db9128c5e43686e59

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GSgalgoR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 4 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 3b4a9378eb04e5accf13ca71fc0887bbdded8827

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GSgalgoR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

harpomaxx commented 4 years ago

Hello,

After pushing a version bump to trigger a new build, the bioc-issue-bot reported an unknown ERROR on the merida1 host running macOS:

Error: package or namespace load failed for BiocCheck in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
 there is no package called biocViews
Execution halted

The Windows and Linux platforms did not report the error. Since we don't have access to macOS, we are not sure how to proceed. Could you please point us to some information on how to deal with the ERROR?

Best regards,

Carlos A. Catania (AKA Harpo)

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GSgalgoR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

mtmorgan commented 4 years ago

Sorry for my slow review.

DESCRIPTION / NAMESPACE / NEWS

Is the NEWS file parsed correctly by utils::news()?

vignettes

extensive vignettes, good!
please update the installation instructions to use BiocManager::install() as on package landing pages; you may also include github installation instructions if desired.
consider using message = FALSE in code chunks (e.g., with library() calls) where the output is unlikely to be informative.
Consider saving data under inst/data using saveRDS() so that one can simply, e.g., test <- readRDS(...)
Is there a reason why functions defined in the vignette, e.g., DropDuplicates(), are not instead defined in the package, so that the user does not need to copy and paste them from the vignette?
It looks like you are using Biobase::ExpressionSet to represent the data, but most modern packages use SummarizedExperiment

R (comments apply generally, not just to the specific lines mentioned)

callback-functions.R:94 consider providing userdir = tempdir() as the argument to this function, so that the argument default is self-explanatory to the user.
callback-functions.R:108 use file.path() for file path construction; e.g., this avoids the need to paste "/" on to tempdir().
cluster-classifier.R:37 if the size of R is large, consider using matrixStats::rowMaxs() instead of apply().
cluster-classifier.R:163 'hoist' common subexpressions outside loops. For instance, evaluate unlist(class) once rather than c times.
convert-functions.R:61 always use accessor functions rather than direct slot access. If not intended for end-users, define the accessors using a convention to indicate that they are not exported, e.g., .Solutions <- function(x) x@Solutions.
convert-functions.R:59 Avoid 'copy and append' patterns of list creation, where a zero-length list is extended in a for loop. Instead pre-allocate the result OUTPUT = vector("list", nrow(.Solutions(output))) or better leave list allocation to R using OUTPUT <- lapply(.Solutions(output), function(x) ...). Note also that this pattern reduces the number of calls to @ (once, instead of nrow(.Solutions(output))) and hence is more efficient.
distance-functions.R:87 adopt standard indenting
distance-functions.R:104 generally, message(paste("foo", x, "bar")) is redundant; use message("foo ", x, " bar")
galgo.R:84 this version of 'copy and append' (x <- c(x, x1)) in particular is very inefficient.
galgo.R:663 use on.exit(parallel::stopCluster(cluster)) to ensure that the cluster is stopped even if an error occurs before the end of the function.

man

good

mtmorgan commented 4 years ago

Please drop a brief comment here indicating that you are working on a revision. @harpomaxx

harpomaxx commented 4 years ago

Sorry. Sure. We are working on a revision of the package. We have already fixed most of the fundamental issues. We expect to submit the revised version soon.

bioc-issue-bot commented 4 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 23bf6ef203b84e4fb96e17e8f8ec5c570a7f5f15

bioc-issue-bot commented 4 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/GSgalgoR to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

harpomaxx commented 4 years ago

Thanks for the thorough review of our package. We have tried to follow all your recommendations and suggestions.

Is the NEWS file parsed correctly by utils::news()?

A: We have verified this using utils::news(), and the file was parsed correctly.

vignettes

please update the installation instructions to use BiocManager::install() as on package landing pages; you may also include github installation instructions if desired.

A: We added instructions for installing the package with BiocManager and included installation via devtools as an alternative.

consider using message = FALSE in code chunks (e.g., with library() calls) where the output is unlikely to be informative.

A: We included message = FALSE when loading libraries in the vignettes. We didn’t find any other cases in either vignette, unless you recommend removing the galgo() output?

Consider saving data under inst/data using saveRDS() so that one can simply, e.g., test <- readRDS(...)

A: We considered using readRDS() as an option. However, the saved object exceeded the size accepted for Bioconductor packages, and reducing it was not possible while keeping a useful example in the vignettes.
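As an illustration of the round trip the reviewer suggests, the sketch below saves a small object with saveRDS(), reads it back with readRDS(), and checks the file size on disk. The object `test` and the file name `test.rds` are hypothetical stand-ins, not the vignette data.

```r
# Hypothetical small object standing in for a vignette data set.
set.seed(1)
test <- list(expression = matrix(rnorm(20), nrow = 4), label = "example")

# Write to a temporary location and read it back.
rds_path <- file.path(tempdir(), "test.rds")
saveRDS(test, rds_path)
restored <- readRDS(rds_path)

# The restored object matches the original; file.size() reports bytes on disk,
# which is the figure to compare against a package size limit.
round_trip_ok <- identical(test, restored)
size_bytes <- file.size(rds_path)
```

Checking `size_bytes` before committing a data file is one way to find out early whether it would exceed a size limit.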

Is there a reason why functions defined in the vignette, e.g., DropDuplicates(), are not instead defined in the package, so that the user does not need to copy and paste them from the vignette?

A: The DropDuplicates() and expandProbesets() functions were written specifically for the datasets used in the examples, in particular to adapt the TRANSBIG and UPP datasets for PAM50 classification in the case study. Those datasets come from third-party packages, and the functions are neither general nor needed for galgo to work.

It looks like you are using Biobase::ExpressionSet to represent the data, but most modern packages use SummarizedExperiment

A: For the vignettes we used two datasets included in the Bioconductor packages breastCancerUPP and breastCancerTRANSBIG. These packages are from other researchers and use the Biobase::ExpressionSet structure. Galgo does not require this structure, and both ExpressionSet and SummarizedExperiment data can be adapted for use in our package.

R (comments apply generally, not just to the specific lines mentioned)

callback-functions.R:94 consider providing userdir = tempdir() as the argument to this function, so that the argument default is self-explanatory to the user.

A: All functions now have tempdir() as the default value for userdir. The function bodies were also modified accordingly.

callback-functions.R:108 use file.path() for file path construction; e.g., this avoids the need to paste "/" on to tempdir().

A: Thanks for the suggestion. We now use file.path(), so pasting “/” is no longer required.
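The two callback-functions.R points can be sketched together as follows. `save_checkpoint()` and its `filename` argument are hypothetical and are not taken from GSgalgoR; only the `userdir = tempdir()` default and the use of file.path() come from the review.

```r
# Hypothetical helper (not a GSgalgoR function): the default output directory
# is visible in the signature, and the path is built with file.path().
save_checkpoint <- function(object, userdir = tempdir(),
                            filename = "checkpoint.rds") {
    if (!dir.exists(userdir)) {
        dir.create(userdir, recursive = TRUE)
    }
    path <- file.path(userdir, filename)  # no manual paste() of "/"
    saveRDS(object, path)
    invisible(path)
}

path <- save_checkpoint(list(generation = 1L))
```

A caller who wants to keep the output can pass a different `userdir` without reading the function body to find out where files go by default.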

cluster-classifier.R:37 if the size of R is large, consider using matrixStats::rowMaxs() instead of apply().

A: Interesting suggestion; we were not aware of this function. We benchmarked both options over the range of dataset sizes we reasonably expect, and the difference in performance is negligible. Also, the suggested function returns the maximum value of each row, whereas we need the position of the maximum, which makes it inconvenient for our use.
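For the position of the row maximum, base R's max.col() may be worth a look as a vectorized alternative to apply(). This is only a sketch on made-up data; whether it suits the package's actual matrices (ties, missing values) is an assumption.

```r
# Made-up score matrix: one row per sample, one column per class.
set.seed(42)
scores <- matrix(rnorm(5 * 3), nrow = 5)

# Row-wise position of the maximum via apply().
pos_apply <- apply(scores, 1, which.max)

# Vectorized alternative from base R. ties.method = "first" mirrors
# which.max(), but note that max.col() treats nearly equal values as ties
# within a small relative tolerance.
pos_maxcol <- max.col(scores, ties.method = "first")

same_positions <- all(pos_apply == pos_maxcol)
```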

cluster-classifier.R:163 'hoist' common subexpressions outside loops. For instance, evaluate unlist(class) once rather than c times.

A: We did our best to find the common subexpressions and hoist them outside the loops.
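A minimal sketch of what hoisting means here, on hypothetical inputs (`class_list` and `class_levels` are illustrative names, not the package's variables):

```r
# Hypothetical inputs: a list of per-sample class labels and the class levels.
class_list <- list(c("a", "b"), c("a", "c"), "b")
class_levels <- c("a", "b", "c")

# Before: unlist(class_list) is re-evaluated on every iteration.
counts_slow <- integer(length(class_levels))
for (i in seq_along(class_levels)) {
    counts_slow[i] <- sum(unlist(class_list) == class_levels[i])
}

# After: the loop-invariant expression is computed once, outside the loop.
labels <- unlist(class_list)
counts_fast <- integer(length(class_levels))
for (i in seq_along(class_levels)) {
    counts_fast[i] <- sum(labels == class_levels[i])
}
```

Both loops give the same counts; the second simply avoids repeating work that does not depend on `i`.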

convert-functions.R:61 always use accessor functions rather than direct slot access. If not intended for end-users, define the accessors using a convention to indicate that they are not exported, e.g., .Solutions <- function(x) x@Solutions.

A: We wanted end users to have access to the galgo.Obj slots.
Therefore, we implemented generic methods for both slots, ParetoFront and Solutions.
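Accessors of this kind could look roughly like the sketch below. The class name `DemoResult` and the slot types are assumptions made for illustration; the thread only states that the slots are called Solutions and ParetoFront.

```r
library(methods)

# Hypothetical class for illustration; the slot types are assumptions.
setClass("DemoResult",
         representation(Solutions = "matrix", ParetoFront = "list"))

# Exported-style accessors defined as S4 generics with one method each.
setGeneric("Solutions", function(x) standardGeneric("Solutions"))
setMethod("Solutions", "DemoResult", function(x) x@Solutions)

setGeneric("ParetoFront", function(x) standardGeneric("ParetoFront"))
setMethod("ParetoFront", "DemoResult", function(x) x@ParetoFront)

obj <- new("DemoResult",
           Solutions = matrix(1:4, nrow = 2),
           ParetoFront = list(c(1, 2)))

sol <- Solutions(obj)     # callers never touch obj@Solutions directly
front <- ParetoFront(obj)
```

With accessors in place, the internal representation can change later without breaking user code that calls `Solutions()` or `ParetoFront()`.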

convert-functions.R:59 Avoid 'copy and append' patterns of list creation, where a zero-length list is extended in a for loop. Instead pre-allocate the result OUTPUT = vector("list", nrow(.Solutions(output))) or better leave list allocation to R using OUTPUT <- lapply(.Solutions(output), function(x) ...). Note also that this pattern reduces the number of calls to @ (once, instead of nrow(.Solutions(output))) and hence is more efficient.

A: We have pre-allocated all lists in convert-functions.R as well as in other source files such as cluster-classifier.R and results-functions.R.
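The three list-building styles discussed in this item can be compared side by side. The matrix `solutions` is made-up data, and the per-row computation is a placeholder.

```r
solutions <- matrix(1:6, nrow = 3)  # made-up data, one solution per row

# Anti-pattern: start from an empty list and grow it on each iteration.
grown <- list()
for (i in seq_len(nrow(solutions))) {
    grown[[length(grown) + 1]] <- solutions[i, ] * 2
}

# Pre-allocate the result, then fill by index.
prealloc <- vector("list", nrow(solutions))
for (i in seq_len(nrow(solutions))) {
    prealloc[[i]] <- solutions[i, ] * 2
}

# Or let lapply() handle allocation.
applied <- lapply(seq_len(nrow(solutions)), function(i) solutions[i, ] * 2)
```

All three produce the same list; the difference is only in how often R has to reallocate the container.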

distance-functions.R:87 adopt standard indenting

A: We have fixed all the lines with incorrect indentation.

distance-functions.R:104 generally, message(paste("foo", x, "bar")) is redundant; use message("foo ", x, " bar")

A: Thanks for the suggestion. We weren’t aware of this behavior of message().
The call to paste() was removed from all calls to message().
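A quick way to see that the two forms are equivalent is to capture the message text instead of printing it. `capture_message()` is a hypothetical helper written for this sketch.

```r
x <- 42

# Capture the text of a message condition instead of printing it.
capture_message <- function(expr) {
    tryCatch(expr, message = function(m) conditionMessage(m))
}

# message() concatenates its arguments without a separator, so the spaces
# that paste() would insert are written into the strings instead.
with_paste <- capture_message(message(paste("foo", x, "bar")))
without_paste <- capture_message(message("foo ", x, " bar"))
```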

galgo.R:84 this version of 'copy and append' (x <- c(x, x1)) in particular is very inefficient.

A: We have fixed the antipattern of appending elements to a vector by pre-allocating the memory. The change was made not only in the section you mentioned but also in other parts of the package.
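For atomic vectors the same idea applies. The sketch below uses a made-up computation (squares of 1 to n) to contrast the x <- c(x, x1) pattern with pre-allocation and with vapply().

```r
n <- 5L

# Anti-pattern: x <- c(x, x1) copies the whole vector on every iteration.
grown <- numeric(0)
for (i in seq_len(n)) {
    grown <- c(grown, i^2)
}

# Pre-allocate once and assign by index.
prealloc <- numeric(n)
for (i in seq_len(n)) {
    prealloc[i] <- i^2
}

# vapply() allocates the result and checks each element's type and length.
applied <- vapply(seq_len(n), function(i) i^2, numeric(1))
```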

galgo.R:663 use on.exit(parallel::stopCluster(cluster)) to ensure that the cluster is stopped even if an error occurs before the end of the function.

A: We added on.exit() to ensure the cluster is stopped. However, the return value of galgo() then appeared to be lost and, as a consequence, all tests failed. We solved the problem by calling return() explicitly at the end of galgo() instead of relying on the value of the last expression.
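The thread does not show the galgo() code, so the following is only a guess at the cause: if on.exit() replaced a stopCluster() call that was the last expression of the function, the function would return on.exit()'s own value, an invisible NULL. The sketch uses a flag in an environment in place of a real cluster; all names are hypothetical.

```r
state <- new.env()

# on.exit() as the last expression: the function returns on.exit()'s own
# value, which is an invisible NULL, so the computed result is lost.
cleanup_last <- function() {
    result <- 1 + 1
    on.exit(state$closed <- TRUE)
}

# Cleanup registered right after the resource is acquired; the last
# expression is still the result, so no explicit return() is needed.
cleanup_first <- function() {
    on.exit(state$closed <- TRUE)
    result <- 1 + 1
    result
}

lost <- cleanup_last()
kept <- cleanup_first()
```

Registering the handler immediately after the resource is created also covers errors raised anywhere later in the function, which is the reviewer's original concern.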

bioc-issue-bot commented 4 years ago

Your package has been accepted. It will be added to the Bioconductor nightly builds.

Thank you for contributing to Bioconductor!

mtmorgan commented 4 years ago

The master branch of your GitHub repository has been added to Bioconductor's git repository.

To use the git.bioconductor.org repository, we need an 'ssh' key to associate with your github user name. If your GitHub account already has ssh public keys (https://github.com/harpomaxx.keys is not empty), then no further steps are required. Otherwise, do the following:

See further instructions at

https://bioconductor.org/developers/how-to/git/

for working with this repository. See especially

https://bioconductor.org/developers/how-to/git/new-package-workflow/ https://bioconductor.org/developers/how-to/git/sync-existing-repositories/

to keep your GitHub and Bioconductor repositories in sync.

Your package will be included in the next nightly 'devel' build (check-out from git at about 6 pm Eastern; build completion around 2pm Eastern the next day) at

https://bioconductor.org/checkResults/

(Builds sometimes fail, so ensure that the date stamps on the main landing page are consistent with the addition of your package). Once the package builds successfully, your package will be available for download in the 'Devel' version of Bioconductor using BiocManager::install("GSgalgoR"). The package 'landing page' will be created at

https://bioconductor.org/packages/GSgalgoR

If you have any questions, please contact the bioc-devel mailing list (https://stat.ethz.ch/mailman/listinfo/bioc-devel); this issue will not be monitored further.


GSgalgoR: Identification and Study of Prognostic Gene Expression Signatures in Cancer #1616
